Re: select within aggregate?

From: Bruno Wolff III <bruno(at)wolff(dot)to>
To: Vortex <vortex25(at)gmx(dot)de>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: select within aggregate?
Date: 2005-05-06 12:49:23
Message-ID: 20050506124923.GB14417@wolff.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Fri, May 06, 2005 at 09:25:35 +0200,
Vortex <vortex25(at)gmx(dot)de> wrote:
> Hi!!
>
> Of course the following is possible in various ways but i wonder
> if there is an elegant and simple solution:
>
> A part from apache log:
> remote_host varchar(50)
> request_uri varchar(50)
> ts timestamp with time zone
>
> Now i want to see which host has done how many requests. This
> is not difficult:
>
> SELECT remote_host, count(*) FROM apache_log GROUP BY remote_host;
>
> But now i would like to add a column to the result which
> shows the most recent requested uri for each group.
> This sounds quite easy at first glance but i see no simple solution.
>
> I suppose with max(ts) i could acquire the time of the
> most recent request within the group, but how do i get
> the corresponding request_uri without doing subquerys or
> something like that?

You could use max(ts) as long as you can have two requests occur at the
same time. This might be a reasonable assumption, but it might have been
better to include a sequence so that each record would be guaranteed to
have a unique key.

Using DISTINCT ON to get the latest URIs is probably better. You can
join that query to the one doing the count. This will probably be more
efficient than executing a subquery for each remote host.

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Joel Fradkin 2005-05-06 13:50:06 encoding
Previous Message Kenneth Gonsalves 2005-05-06 09:14:29 pg_dump without data