From: | Chris Curvey <chris(at)chriscurvey(dot)com> |
---|---|
To: | Steve Crawford <scrawford(at)pinpointresearch(dot)com> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Statistics query |
Date: | 2013-04-10 17:31:49 |
Message-ID: | CADfwSsCAu53117whajG3+x0Eaf_DHbS3KrnQbSWjTV_5Nv_G5A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Wed, Apr 10, 2013 at 12:30 PM, Steve Crawford <
scrawford(at)pinpointresearch(dot)com> wrote:
> I'm seeking ideas on the best way to craft the following query. I've
> stripped everything down to the bare essentials and simplified it below.
>
> Input data has a timestamp (actually an int received from the system in
> the form of a Unix epoch), a unit identifier and a status:
>
> event_time | unit_id | status
> ------------+---------+-------**-
> 1357056011 | 60 | 1
> 1357056012 | 178 | 0
> 1357056019 | 168 | 0
> 1357056021 | 3 | 0
> 1357056021 | 4 | 1
> 1357056021 | 179 | 0
> 1357056022 | 0 | 1
> 1357056022 | 1 | 0
> 1357056023 | 2 | 0
> 1357056024 | 9 | 0
> 1357056025 | 5 | 0
> 1357056025 | 6 | 0
> 1357056026 | 7 | 1
> ...
>
> A given unit_id cannot have two events at the same time (enforced by
> constraints).
>
> Given a point in time I would like to:
>
> 1. Identify all distinct unit_ids with an entry that exists in the
> preceding hour then
>
> 2. Count both the total events and sum the status=1 events for the most
> recent 50 events for each unit_id that fall within a limited period (e.g.
> don't look at data earlier than midnight). So unit_id 60 might have 50
> events in the last 15 minutes while unit_id 4 might have only 12 events
> after midnight.
>
> The output would look something like:
>
> unit_id | events | status_1_count
> ---------+--------+-----------**-----
> 1 | 50 | 34
> 2 | 27 | 18
> 1 | 50 | 34
> 1 | 2 | 0
> ...
>
> Each sub-portion is easy and while I could use external processing or
> set-returning functions I was hoping first to find the secret-sauce to glue
> everything together into a single query.
>
> Cheers,
> Steve
>
>
> something like
select unit_id, count(*), sum(status)
from mytable a
where event_time >= [whatever unix epoch translates to "last midnight"]
and exists
( select *
from mytable b
where b.unit_id = a.unit_id
and b.epoch >= [unix epoch that translates to "one hour ago"])
group by unit _id;
1) I think figuring out the unix epoch should be reasonable...but I don't
know how to do it off the top of my head.
2) I could completely be misunderstanding this. I'm not sure why the
example results would have unit id 1 repeated. (which my suggestion WON'T
do)
From | Date | Subject | |
---|---|---|---|
Next Message | Steve Crawford | 2013-04-10 17:51:52 | Re: Statistics query |
Previous Message | Steve Crawford | 2013-04-10 16:30:55 | Statistics query |