Re: Group by range in hour of day

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Israel Brewster <israel(at)ravnalaska(dot)net>
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, "pgsql-general(at)postgresql(dot)org general" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Group by range in hour of day
Date: 2015-03-16 23:57:06
Message-ID: CAKFQuwZCp_JkgbhMaMWCutw=3uLur2-TE-Dbjm46GO_28PP_=g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, Mar 16, 2015 at 4:16 PM, Israel Brewster <israel(at)ravnalaska(dot)net>
wrote:

> On Mar 16, 2015, at 2:22 PM, David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
> wrote:
>
>
> On Mon, Mar 16, 2015 at 3:13 PM, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
> wrote:
>
>> On 03/16/2015 02:57 PM, Israel Brewster wrote:
>>
>>> I have a table with two timestamp columns for the start time and end
>>> time of each record (call them start and end).I'm trying to figure out
>>> if there is a way to group these records by "hour of day", that is the
>>> record should be included in the group if the hour of the day for the
>>> group falls anywhere in the range [start,end]. Obviously each record may
>>> well fall into multiple groups under this scenario.
>>>
>>> The goal here is to figure out, for each hour of the day, a) what is the
>>> total number of "active" records for that hour, and b) what is the total
>>> "active" time for those records during the hour, with an ultimate goal
>>> of figuring out the average active time per record per hour.
>>>
>>> So, for simplified example, if the table contained three records:
>>>
>>> start | end
>>> -----------------------------------------------------
>>> 2015-03-15 08:15 | 2015-03-15 10:45
>>> 2015-03-15 09:30 | 2015-03-15 10:15
>>> 2015-03-15 10:30 | 2015-03-15 11:30
>>>
>>>
>>> Then the results should break out something like this:
>>>
>>> hour | count | sum
>>> -----------------------------
>>> 8 | 1 | 0.75
>>> 9 | 2 | 1.5
>>> 10 | 3 | 1.5
>>> 11 | 1 | 0.5
>>>
>>> I can then easily manipulate these values to get my ultimate goal of the
>>> average, which would of course always be less than or equal to 1. Is
>>> this doable in postgress? Or would it be a better idea to simply pull
>>> the raw data and post-process in code? Thanks.
>>>
>>
>> Do not have an answer for you, but a question:
>>
>> What version of Postgres are you on?
>>
>> This will help determine what tools are available to work with.
>
>
> ​The following will give you endpoints for your bounds. Version is
> important since "range types" could be very useful in this situation - but
> you'd still need to generate the bounds info regardless.​
>
> ​
> SELECT *
> FROM
> (SELECT * FROM generate_series('2015-03-15'::timestamptz,
> '2015-03-16'::timestamptz, '1 hour'::interval) start (start_ts)) AS s
> CROSS JOIN
> (SELECT end_ts + '1 hour'::interval AS end_ts FROM
> generate_series('2015-03-15'::timestamptz, '2015-03-16'::timestamptz, '1
> hour'::interval) e (end_ts)) AS e
>
> You would join this using an ON condition with an OR (start BETWEEN [...]
> OR end BETWEEN [...]) - range logic will be better and you may want to
> adjust the upper bound by negative 1 (nano-second?) to allow for easier
> "<=" logic if using BETWEEN.
>
>
> Thanks, that is very helpful, but are you sure CROSS JOIN is what you
> wanted here? using that, I get a 625 row result set where each row from the
> first SELECT is paired up with EVERY row from the second select. I would
> think I would want the first row of the first SELECT paired up with only
> the first row of the second, second row of the first paired with the second
> row of the second, etc - i.e. 24 start and end bounds. Or am I missing
> something?
>

​No, I rushed things...:( Sorry. My concept is good though but indeed you
want to end up with a table having only 24 rows (for the sample).

LATERAL may work here but I haven't had a chance to play with it yet. A
simple ordinal column to join on would be sufficient.

David J.​

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Peter Eisentraut 2015-03-17 01:37:40 Re: pitr archive_command cp fsync
Previous Message Adrian Klaver 2015-03-16 23:46:33 Re: Group by range in hour of day