Re: SELECT DISTINCT too slow

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Miroslav ?ulc <miroslav(dot)sulc(at)startnet(dot)cz>, pgsql-sql(at)postgresql(dot)org
Subject: Re: SELECT DISTINCT too slow
Date: 2006-06-01 15:46:28
Message-ID: 87k680hoa3.fsf@stark.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:

> Miroslav ?ulc wrote:
> > Well, "key" is not primary key from another table. It is just a column
> > in pair "key" => "value".
> > The structure of the table is this:
> >
> > Id (primary key)
> > MRTPContactId (id of contact from table MRTPContact)
> > Key (key from pair key => value)
> > Value (value from pair key => value)
> >
> > So I want the get the list of keys used in the table.
>
> The plan you get is the most efficient possible for that query. If you
> had a table of possible keys (which should of course be FK of "Key"),
> you could get a much faster version :-)

Actually you could try the equivalent query:

SELECT Key FROM MRTPContactValue GROUP BY Key

This may or may not be faster because it can use a hash aggregate plan. I
would expect it to be faster here because there are few distinct keys and the
planner predicts that.

Eventually these two queries should be handled the same by Postgres but Hash
Aggregates are a new addition and DISTINCT/DISTINCT ON hasn't been adapted to
make use of them.

Also, incidentally, I don't see how a table of possible keys could help you
here. Nothing forces they table MRTPContactValue to use all possible keys...

--
greg

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Miroslav Šulc 2006-06-01 16:06:53 Re: SELECT DISTINCT too slow
Previous Message Alvaro Herrera 2006-06-01 15:46:13 Re: SELECT DISTINCT too slow