From: | Filip Rembiałkowski <plk(dot)zuber(at)gmail(dot)com> |
---|---|
To: | Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it> |
Cc: | pgsql-sql(at)postgresql(dot)org |
Subject: | Re: get distinct + group by then filter |
Date: | 2009-12-18 12:20:00 |
Message-ID: | 92869e660912180420h548ec7dak2f85f0780a8d0318@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
2009/12/17 Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it>
> I've a web application and I'm trying to do some reporting on
> affiliate commission
>
> create table tracky_hit (
> hitid serial,
> esid varchar(32), -- related to browser session
> track_time timestamp,
> aid varchar(32), -- affiliate code
> -- some other tracking stuff
> );
>
> create table tracky_event (
> eventid serial,
> esid varchar(32) references tracky_hit (esid)
>
This imples that tracky_hit.esid is at least UNIQUE.
);
>
> create table tracky_ordergroup_event (
> ordergroupid int references ...,
> eventid int references tracky_event (eventid)
> );
>
> Now I'd like to pick up the first hit for each esid in a given
> interval of time for a given aid and relate them with ordergroupid.
>
> aid may change across the same esid.
>
If tracky_hit.esid is unique, then why same esid can have many aids?
Can you specify more complete schema (at least PKeys would be nice)?
> Getting the first hit for each esid can be done:
>
> select min(hitid) as h
> from tracky_hit
> group by esid;
>
> or
>
> select distinct on (esid) hitid
> from tracky_hit
> order by esid, track_time;
>
DISTINCT ON seems a good aproach tu such queries.
>
> If I put a where aid='somestuff' right in the above query... I'm not
> picking up the first hit in an esid.
>
> The only way that comes to my mind to solve the problem is applying
> the condition later in a subquery, but no conditions means a lot of
> data returned.
>
> I've a similar problem with the interval: if I chop in the middle of
> a session I may not pick up the beginning of each session.
> Furthermore I've to count session just once even if they cross the
> boundary of an interval.
>
> I could do something like:
>
> select oe.ordergroupid from
> tracky_ordergroup_event oe
> join tracky_event e on e.eventid=oe.eventid
> join tracky_hit th on th.esid=e.esid
> where th.hitid in
> (select distinct on (esid) hitid
> from tracky_hit
> where track_time between
> ('2009-12-01'::timestamp - interval '1 days')
> and
> ('2009-12-01'::timestamp + interval '1 months' + interval '1
> days')
> order by esid, track_time
> )
> and th.aid='someaid'
> and th.track_time between
> ('2009-12-01'::timestamp)
> and
> ('2009-12-01'::timestamp + interval '1 months');
>
> but this looks awful. Any better way?
>
> I'm on 8.3 and no short term plan to move to 8.4
>
> thanks
>
> --
> Ivan Sergio Borgonovo
> http://www.webthatworks.it
>
>
> --
> Sent via pgsql-sql mailing list (pgsql-sql(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-sql
>
--
Filip Rembiałkowski
JID,mailto:filip(dot)rembialkowski(at)gmail(dot)com
http://filip.rembialkowski.net/
From | Date | Subject | |
---|---|---|---|
Next Message | Ivan Sergio Borgonovo | 2009-12-18 13:44:59 | Re: get distinct + group by then filter |
Previous Message | Ivan Sergio Borgonovo | 2009-12-17 15:28:09 | get distinct + group by then filter |