Re: select top N entries from several groups (Modified by David Orme)

From: David Orme <d(dot)orme(at)imperial(dot)ac(dot)uk>
To: Sean Davis <sdavis2(at)mail(dot)nih(dot)gov>
Subject: Re: select top N entries from several groups (Modified by David Orme)
Date: 2005-04-12 14:25:51
Message-ID: 1523684f3c879290c184392856a86b08@ic.ac.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

[Forgot to send this to the list address rather than Sean's. Apologies.]

Hi Sean,

Many thanks for that. If I understand the syntax correctly, the
subquery is executed for each row and checks to see if the current val
is in the top N list for the gp of that row.

My problem is that the real life example is on a table of 54,720 rows.
A single run of the subquery takes 67ms - so the whole query takes
something like an hour [(0.067*54720)/(60*60)]. At least, it certainly
takes a long time and that is my interpretation. My suspicion is that
there isn't any way round this - I can have elegant, short, slower SQL
or write the subquery for each group independently - faster but not
nearly so pleasing. Fortunately the real number of groups is fixed and
small (8) so this is tractable.

Is this right?

Thanks,
David

On 12 Apr 2005, at 13:21, Sean Davis wrote:

>
> On Apr 12, 2005, at 7:21 AM, David Orme wrote:
>
>> Hi,
>>
>> Suppose I have a table (called temp) like this:
>>
>> gp val
>> A 10
>> A 8
>> A 6
>> A 4
>> B 3
>> B 2
>> B 1
>> B 0
>>
>> How can I get the largest two values for each group in a single pass?
>> I want to end up with:
>>
>> gp val
>> A 10
>> A 8
>> B 3
>> B 2
>>
>> I can do this a group at a time using...
>
> How about:
>
> create table temp (
> gp char,
> val int
> );
>
> insert into temp values ('A',10);
> insert into temp values ('A',8);
> insert into temp values ('A',6);
> insert into temp values ('A',4);
> insert into temp values ('B',3);
> insert into temp values ('B',2);
> insert into temp values ('B',1);
>
> select a.gp,a.val
> from temp a
> where a.val in (
> select b.val
> from temp b
> where a.gp=b.gp
> order by b.val desc
> limit 2);
>
> gp | val
> ----+-----
> A | 10
> A | 8
> B | 3
> B | 2
> (4 rows)
>
> I have found this link is useful for beginning to think about
> subqueries:
>
> http://www.postgresql.org/files/documentation/books/aw_pgsql/
> node81.html
>
> Sean
>

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Cook, Larry ISC 2005-04-12 15:23:36 FW: Creating DataBases on Suse LINUX Platform
Previous Message Bob Talbot 2005-04-12 14:17:41 Windows Install