From: | Bruno Wolff III <bruno(at)wolff(dot)to> |
---|---|
To: | Vortex <vortex25(at)gmx(dot)de> |
Cc: | pgsql-sql(at)postgresql(dot)org |
Subject: | Re: select within aggregate? |
Date: | 2005-05-06 12:49:23 |
Message-ID: | 20050506124923.GB14417@wolff.to |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
On Fri, May 06, 2005 at 09:25:35 +0200,
Vortex <vortex25(at)gmx(dot)de> wrote:
> Hi!!
>
> Of course the following is possible in various ways but i wonder
> if there is an elegant and simple solution:
>
> A part from apache log:
> remote_host varchar(50)
> request_uri varchar(50)
> ts timestamp with time zone
>
> Now i want to see which host has done how many requests. This
> is not difficult:
>
> SELECT remote_host, count(*) FROM apache_log GROUP BY remote_host;
>
> But now i would like to add a column to the result which
> shows the most recent requested uri for each group.
> This sounds quite easy at first glance but i see no simple solution.
>
> I suppose with max(ts) i could acquire the time of the
> most recent request within the group, but how do i get
> the corresponding request_uri without doing subquerys or
> something like that?
You could use max(ts) as long as you can have two requests occur at the
same time. This might be a reasonable assumption, but it might have been
better to include a sequence so that each record would be guaranteed to
have a unique key.
Using DISTINCT ON to get the latest URIs is probably better. You can
join that query to the one doing the count. This will probably be more
efficient than executing a subquery for each remote host.
From | Date | Subject | |
---|---|---|---|
Next Message | Joel Fradkin | 2005-05-06 13:50:06 | encoding |
Previous Message | Kenneth Gonsalves | 2005-05-06 09:14:29 | pg_dump without data |