From: Pierre-Frédéric Caillaud <lists(at)boutiquenumerique(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: sequential scan on select distinct
Date: 2004-10-06 17:34:22
Message-ID: opsfglrkvmcq72hf@musicbox
Lists: pgsql-performance
There are even three questions here:
- given that 'SELECT DISTINCT field FROM table' is exactly
the same as 'SELECT field FROM table GROUP BY field', postgres could
transform the first into the second and spare itself a (potentially
killer) sort.
In my example the table was not too large, but on a very large table,
sorting all the values and then de-duplicating them does not look too
appealing.
Currently Postgres does a Sort followed by a Unique, but there could be
a DistinctSort instead of a Sort, that is, a node that sorts and removes
the duplicates at the same time. It would be not much more complicated
to code than a plain sort, and much faster in this case.
Or there could be a DistinctHash, which would be similar or rather
identical to a HashAggregate, and would again skip the sort.
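To illustrate, this is the kind of manual rewrite I mean, on the same
dummy table as in the plans below (a minimal sketch; the exact plan
shapes will of course depend on version and settings):

explain select distinct number from dummy;
-- today: typically planned as a Sort + Unique

explain select number from dummy group by number;
-- same result, but can be planned as a HashAggregate, skipping the sort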
It would (as a bonus) speed up queries like UNION (not ALL) and similar
constructs. For example:
explain (select number from dummy) union (select number from dummy);
 Unique  (cost=287087.62..297087.62 rows=2000000 width=4)
   ->  Sort  (cost=287087.62..292087.62 rows=2000000 width=4)
         Sort Key: number
         ->  Append  (cost=0.00..49804.00 rows=2000000 width=4)
               ->  Subquery Scan "*SELECT* 1"  (cost=0.00..24902.00 rows=1000000 width=4)
                     ->  Seq Scan on dummy  (cost=0.00..14902.00 rows=1000000 width=4)
               ->  Subquery Scan "*SELECT* 2"  (cost=0.00..24902.00 rows=1000000 width=4)
                     ->  Seq Scan on dummy  (cost=0.00..14902.00 rows=1000000 width=4)
This is scary!
I can rewrite it as such (and the planner could, too):
explain select * from ((select number from dummy) union all (select number
from dummy)) as foo group by number;
 HashAggregate  (cost=74804.00..74804.00 rows=200 width=4)
   ->  Subquery Scan foo  (cost=0.00..69804.00 rows=2000000 width=4)
         ->  Append  (cost=0.00..49804.00 rows=2000000 width=4)
               ->  Subquery Scan "*SELECT* 1"  (cost=0.00..24902.00 rows=1000000 width=4)
                     ->  Seq Scan on dummy  (cost=0.00..14902.00 rows=1000000 width=4)
               ->  Subquery Scan "*SELECT* 2"  (cost=0.00..24902.00 rows=1000000 width=4)
                     ->  Seq Scan on dummy  (cost=0.00..14902.00 rows=1000000 width=4)
which avoids a large sort...
However, there must be cases in which performing a sort is faster, for
instance when there are so many distinct values that the HashAggregate
itself becomes huge.
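One can check which strategy wins on a given dataset with the enable_*
settings (a quick sketch, again assuming the dummy table above):

set enable_hashagg = off;
explain analyze select number from dummy group by number;
-- forced onto the sort-based path

set enable_hashagg = on;
explain analyze select number from dummy group by number;
-- hash-based path, when the planner picks it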
> Well there are two questions here. Why given the current plans
> available does postgres choose a sequential scan instead of an index
> scan. And why isn't
Well, because it needs to get all the rows in the table, in order.
In this case seq scan + sort is about twice as fast as an index scan.
Interestingly, once I ANALYZEd the table, postgres chooses to
index-scan, which is slower.
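For reference, this is how the two plans can be compared by hand (the
enable_* flags merely discourage a plan type rather than forbid it):

set enable_indexscan = off;
explain analyze select distinct number from dummy;
-- seq scan + sort variant
reset enable_indexscan;

set enable_seqscan = off;
explain analyze select distinct number from dummy;
-- index scan variant
reset enable_seqscan;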
> there this kind of "skip index scan" available.
It would be really nice to have a skip index scan available.
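For the record, such a skip scan can be emulated today in plpgsql (a
sketch only, assuming an index on dummy(number); each loop iteration
does one index probe to jump to the next distinct value):

create or replace function distinct_numbers() returns setof integer as '
declare
    cur integer;
begin
    -- seed with the smallest value, fetched through the index
    select into cur number from dummy order by number asc limit 1;
    while cur is not null loop
        return next cur;
        -- one index probe: jump directly to the next distinct value
        select into cur number from dummy
            where number > cur order by number asc limit 1;
    end loop;
    return;
end;
' language plpgsql;

select * from distinct_numbers();

This is fast when there are few distinct values (each one costs a
single index lookup) and slow when there are many, which is exactly
the judgement call mentioned at the end of the quoted message below.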
I have another idea; let's call it the "indexed sequential scan":
When pg knows there are a lot of rows to access, it will ignore the
index and seqscan, because index access is very random, thus slow.
However, postgres could implement an "indexed sequential scan" where:
- the page numbers for the matching rows are looked up in the index
(this is fast, as an index has good locality)
- the page numbers are de-duplicated, so we have a list with one and
only one instance of each page number
- the list is then sorted, so we have the page numbers in order
- the pages are loaded in sorted order (doing a kind of partial
sequential scan), which would be faster than reading them randomly.
Other ideas later
> Postgres chooses a sequential scan with a sort (or hash aggregate)
> over an index scan because it expects it to be faster. Sequential
> scans are much faster than random access scans of indexes, plus index
> scans need to read many more blocks. If you're finding the index scan
> to be just as fast as sequential scans you might consider lowering
> random_page_cost closer to 1.0. But note that you may be getting
> fooled by a testing methodology where more things are cached than
> would be in production.
>
> why isn't a "skip index scan" plan available? Well, nobody's written
> the code yet. It would be part of the same code needed to get an
> index scan used for:
>
> select y,min(x) from bar group by y
>
> And possibly also related to the TODO item:
>
> Use index to restrict rows returned by multi-key index when used
> with non-consecutive keys to reduce heap accesses
>
> For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and
> col3 = 9, spin through the index checking for col1 and col3 matches,
> rather than just col1
>
> Note that the optimizer would have to make a judgement call based on
> the expected number of distinct values. If you had much more than 256
> distinct values then your plpgsql function wouldn't have performed
> well at all.
>