Re: sequential scan on select distinct

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Pierre-Frédéric Caillaud <lists(at)boutiquenumerique(dot)com>, "Ole Langbehn" <ole(at)freiheit(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: sequential scan on select distinct
Date: 2004-10-06 20:02:22
Message-ID: 87sm8ry5rl.fsf@stark.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Greg Stark <gsstark(at)mit(dot)edu> writes:
> > why isn't a "skip index scan" plan available? Well, nobody's written the code
> > yet.
>
> I don't really think it would be a useful plan anyway.

Well it would clearly be useful in this test case, where has a small number of
distinct values in a large table, and an index on the column. His plpgsql
function that emulates such a plan is an order of magnitude faster than the
hash aggregate plan even though it has to do entirely separate index scans for
each key value.

I'm not sure where the break-even point would be, but it would probably be
pretty low. Probably somewhere around the order of 1% distinct values in the
table. That might be uncommon, but certainly not impossible.

But regardless of how uncommon it is, it could be considered important in
another sense: when you need it there really isn't any alternative. It's an
algorithmic improvement with no bound on the performance difference. Nothing
short of using a manually maintained materialized view would bring the
performance into the same ballpark.

So even if it's only useful occasionally, not having the plan available can
leave postgres with no effective plan for what should be an easy query.

--
greg

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Doug Y 2004-10-06 20:04:52 The never ending quest for clarity on shared_buffers
Previous Message Greg Stark 2004-10-06 19:38:56 Re: Comparing user attributes with bitwise operators