Re: get distinct + group by then filter

From: Filip Rembiałkowski <plk(dot)zuber(at)gmail(dot)com>
To: Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: get distinct + group by then filter
Date: 2009-12-18 12:20:00
Message-ID: 92869e660912180420h548ec7dak2f85f0780a8d0318@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

2009/12/17 Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it>

> I've a web application and I'm trying to do some reporting on
> affiliate commission
>
> create table tracky_hit (
> hitid serial,
> esid varchar(32), -- related to browser session
> track_time timestamp,
> aid varchar(32), -- affiliate code
> -- some other tracking stuff
> );
>
> create table tracky_event (
> eventid serial,
> esid varchar(32) references tracky_hit (esid)
>

This imples that tracky_hit.esid is at least UNIQUE.

);
>
> create table tracky_ordergroup_event (
> ordergroupid int references ...,
> eventid int references tracky_event (eventid)
> );
>
> Now I'd like to pick up the first hit for each esid in a given
> interval of time for a given aid and relate them with ordergroupid.
>
> aid may change across the same esid.
>

If tracky_hit.esid is unique, then why same esid can have many aids?

Can you specify more complete schema (at least PKeys would be nice)?

> Getting the first hit for each esid can be done:
>
> select min(hitid) as h
> from tracky_hit
> group by esid;
>
> or
>
> select distinct on (esid) hitid
> from tracky_hit
> order by esid, track_time;
>

DISTINCT ON seems a good aproach tu such queries.

>
> If I put a where aid='somestuff' right in the above query... I'm not
> picking up the first hit in an esid.
>
> The only way that comes to my mind to solve the problem is applying
> the condition later in a subquery, but no conditions means a lot of
> data returned.
>
> I've a similar problem with the interval: if I chop in the middle of
> a session I may not pick up the beginning of each session.
> Furthermore I've to count session just once even if they cross the
> boundary of an interval.
>
> I could do something like:
>
> select oe.ordergroupid from
> tracky_ordergroup_event oe
> join tracky_event e on e.eventid=oe.eventid
> join tracky_hit th on th.esid=e.esid
> where th.hitid in
> (select distinct on (esid) hitid
> from tracky_hit
> where track_time between
> ('2009-12-01'::timestamp - interval '1 days')
> and
> ('2009-12-01'::timestamp + interval '1 months' + interval '1
> days')
> order by esid, track_time
> )
> and th.aid='someaid'
> and th.track_time between
> ('2009-12-01'::timestamp)
> and
> ('2009-12-01'::timestamp + interval '1 months');
>
> but this looks awful. Any better way?
>
> I'm on 8.3 and no short term plan to move to 8.4
>
> thanks
>
> --
> Ivan Sergio Borgonovo
> http://www.webthatworks.it
>
>
> --
> Sent via pgsql-sql mailing list (pgsql-sql(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-sql
>

--
Filip Rembiałkowski
JID,mailto:filip(dot)rembialkowski(at)gmail(dot)com
http://filip.rembialkowski.net/

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Ivan Sergio Borgonovo 2009-12-18 13:44:59 Re: get distinct + group by then filter
Previous Message Ivan Sergio Borgonovo 2009-12-17 15:28:09 get distinct + group by then filter