SQLite Forum

Count distinct and window function
Login
Help with window function

I am trying to count the number of distinct patients so far for any given day.

Here is a simplified example of what I am trying to achieve:

`create table visit(datestring,timestring,patient);`

`insert into visit(datestring,timestring,patient) values ('2021-09-01','09:00','A');`

`insert into visit(datestring,timestring,patient) values ('2021-09-01','12:00','A');`

`insert into visit(datestring,timestring,patient) values ('2021-09-01','10:00','B');`

`insert into visit(datestring,timestring,patient) values ('2021-09-01','11:00','B');`

`insert into visit(datestring,timestring,patient) values ('2021-09-02','09:00','D');`

`insert into visit(datestring,timestring,patient) values ('2021-09-02','12:00','C');`

`insert into visit(datestring,timestring,patient) values ('2021-09-02','10:00','B');`

`insert into visit(datestring,timestring,patient) values ('2021-09-02','13:00','B');`

which gives:

`select * from visit order by datestring,timestring;`

| datestring | timestring | patient |
------------------------------------
| 2021-09-01 | 09:00 | A |
| 2021-09-01 | 10:00 | B |
| 2021-09-01 | 11:00 | B |
| 2021-09-01 | 12:00 | A |
| 2021-09-02 | 09:00 | D |
| 2021-09-02 | 10:00 | B |
| 2021-09-02 | 12:00 | C |
| 2021-09-02 | 13:00 | B |

The closest I could go to the expected result is here:

`select datestring,timestring,patient,(case when count() over (partition by datestring,patient order by timestring rows between unbounded preceding and current row) > 1 then 0 else 1 end) as distinct_count from visit order by datestring,timestring;`

which gives:

| datestring | timestring | patient | distinct_count |
-----------------------------------------------------
| 2021-09-01 | 09:00 | A | 1 |
| 2021-09-01 | 10:00 | B | 1 |
| 2021-09-01 | 11:00 | B | 0 |
| 2021-09-01 | 12:00 | A | 0 |
| 2021-09-02 | 09:00 | D | 1 |
| 2021-09-02 | 10:00 | B | 1 |
| 2021-09-02 | 12:00 | C | 1 |
| 2021-09-02 | 13:00 | B | 0 |


# How do I print the sum of the distinct_count over all the preceding rows which have the same datestring (including the current row)? It would this:


| datestring | timestring | patient | distinct_count | what_I_want |
-------------------------------------------------------------------
| 2021-09-01 | 09:00 | A | 1 | 1 |
| 2021-09-01 | 10:00 | B | 1 | 2 |
| 2021-09-01 | 11:00 | B | 0 | 2 |
| 2021-09-01 | 12:00 | A | 0 | 2 |
| 2021-09-02 | 09:00 | D | 1 | 1 |
| 2021-09-02 | 10:00 | B | 1 | 2 |
| 2021-09-02 | 12:00 | C | 1 | 3 |
| 2021-09-02 | 13:00 | B | 0 | 3 |

I have spent a lot of time trying to figure this out. Any help or pointer would be greatly appreciated.
Thank you.