From: | Ken Tanzer <ken(dot)tanzer(at)gmail(dot)com> |
---|---|
To: | Viktor Fougstedt <viktor(at)chalmers(dot)se> |
Cc: | PG-General Mailing List <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Given a set of daterange, finding the continuous range that includes a particular date |
Date: | 2018-02-24 01:07:19 |
Message-ID: | CAD3a31VXGX-o=9OHEHgD7Ou0iwHnZ6Bnd-dm8FDkokL9-tN8Tw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Fri, Feb 23, 2018 at 6:42 AM, Viktor Fougstedt <viktor(at)chalmers(dot)se>
wrote:
>
> Hi.
>
> This answer is perhaps useful if I understand your problem correctly. But
> I might have interpreted it wrongly. :-)
>
>
> I would probably start with merging intervals so that overlapping and
> adjacent intervals become single continuous intervals, then select from
> those merged intervals.
>
> We have an application with a lot of interval handling in PostgreSQL, and
> we use many variants of algorithms based on one by Itzik Ben Gan which he
> calls “packing intervals”. The post we started with was an old reader’s
> challenge from SQL Magazine. It has since been updated when MS-SQL started
> supporting window functions better (Itzik Ben Gan is a MS-SQL-guy).
>
> Basically, it is a few CTE:s which convert the intervals into “start” (+1)
> and “stop” (-1) events, then keeps a running sum of these, and finally
> creates new intervals by noting that the merged (or “packed”) intervals
> starts with events that had sum=0 before them and stops with events which
> have sum=0 after them.
>
> It involves both CTE:s and window functions and is quite a beautiful
> example of SQL, IMHO.
>
> I think it’s best to google it, he does a good job of explaining how it
> works.
>
> Hope that helps a bit at least.
>
> /Viktor
>
> Hi. Thanks for the many helpful responses! Although I didn't end up with
exactly what I was looking for, I think I got to something that works good
enough for now. I did it with CTEs, in a way I think similar to what you
were suggesting (my printer is out of ink, so I didn't actually get to look
at that book yet!) I ended up having to do 4 passes:
1) Identify the starts and ends of continuous ranges
2) Eliminate the middle-point records (non stops/starts)
3) Merge the stop date in with the starts
4) Eliminate the stops
I couldn't see how to do it in less steps, but if there's a way...
I posted the query along with some sample data below.
Thanks again!
Ken
BEGIN;
CREATE TEMP TABLE sample_data (
client_id INTEGER,
start_date DATE,
end_date DATE,
episode INTEGER -- I won't actually have this info; just a label I added
for clarity and testing
);
INSERT INTO sample_data VALUES
(1,'1990-01-01','1990-12-31',0),
(1,'1991-01-01','1991-12-31',0),
(1,'1995-01-01','1995-06-30',1),
(1,'2000-01-01','2000-12-31',2),
(1,'2001-01-01','2001-12-31',2),
(1,'2002-01-01','2002-12-31',2),
(1,'2003-01-01','2003-12-31',2),
(1,'2004-01-01','2004-12-31',2),
(1,'2005-01-01','2005-12-31',2),
(1,'2006-01-01','2006-12-31',2),
(1,'2014-01-01','2014-12-31',3),
(1,'2015-01-01','2015-12-31',3),
(1,'2017-06-30','2017-12-31',4),
(1,'2018-01-01',NULL,4),
(2,'2014-02-01','2015-01-31',0),
(2,'2015-02-01','2015-12-31',0),
(2,'2017-09-30','2018-01-31',1),
(2,'2018-02-01','2018-02-14',1)
;
WITH peek3 AS (
WITH peek2 AS (
WITH peek AS (
SELECT
client_id,
episode,
daterange(start_date,end_date,'[]') AS range,
COALESCE(daterange(start_date,end_date+1,'[]') &&
lead(daterange(start_date,end_date,'[]')) OVER (PARTITION BY client_id
ORDER BY start_date),false) AS continues,
COALESCE(daterange(start_date,end_date,'[]') &&
lag(daterange(start_date,end_date+1,'[]')) OVER (PARTITION BY client_id
ORDER BY start_date),false) AS is_continued
FROM
sample_data
)
SELECT
*
FROM peek
WHERE NOT (is_continued AND continues)
)
SELECT client_id,episode,range,
daterange(lower(range),
CASE WHEN lead(is_continued) OVER (PARTITION BY client_id ORDER BY range)
THEN
lead(upper(range)) OVER (PARTITION BY client_id ORDER BY range)
ELSE upper(range)
END) AS full_range
,continues,is_continued
FROM peek2
)
SELECT * FROM peek3
WHERE NOT is_continued
;
--
AGENCY Software
A Free Software data system
By and for non-profits
*http://agency-software.org/ <http://agency-software.org/>*
*https://demo.agency-software.org/client
<https://demo.agency-software.org/client>*
ken(dot)tanzer(at)agency-software(dot)org
(253) 245-3801
Subscribe to the mailing list
<agency-general-request(at)lists(dot)sourceforge(dot)net?body=subscribe> to
learn more about AGENCY or
follow the discussion.
From | Date | Subject | |
---|---|---|---|
Next Message | Ken Tanzer | 2018-02-24 02:11:51 | Re: Given a set of daterange, finding the continuous range that includes a particular date (aggregates) |
Previous Message | Thomas Munro | 2018-02-23 23:19:29 | Re: Query error: could not resize shared memory segment |