From: | "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com> |
---|---|
To: | "David West" <david(dot)west(at)cusppoint(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: limit clause breaks query planner? |
Date: | 2008-09-01 12:52:53 |
Message-ID: | 162867790809010552o624a347aif0eb5fc71e7bbad@mail.gmail.com |
Lists: pgsql-performance
Hello
you should use a partial index:
create index foo on mytable (b) where a is null;
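
With the actual table and column names from your post below (my_table, a, b),
and a placeholder index name, that might look like:

create index my_table_b_partial on my_table (b) where a is null;
analyze my_table;  -- refresh statistics so the planner sees the new index

-- the index contains only rows where a is null, so this query can use a
-- cheap index scan even with the limit:
select * from my_table where a is null and b = '21' limit 15;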
regards
Pavel Stehule
2008/9/1 David West <david(dot)west(at)cusppoint(dot)com>:
> Hi,
>
>
>
> I have a single table with about 10 million rows, and two indexes. Index A
> is on a column A with 95% null values. Index B is on a column B with about
> 10 distinct values, i.e. about a million rows for each value.
>
>
>
> When I do a simple query on the table (no joins) with the following
> condition:
>
> A is null AND
>
> B = '21'
>
>
>
> it uses the correct index, index B. However, when I add a limit clause of
> 15, postgres decides to do a sequential scan :s. Looking at the results
> from explain:
>
>
>
> "Limit (cost=0.00..3.69 rows=15 width=128)"
>
> " -> Seq Scan on my_table this_ (cost=0.00..252424.24 rows=1025157
> width=128)"
>
> " Filter: ((A IS NULL) AND ((B)::text = '21'::text))"
>
>
>
> It appears that postgres is (very incorrectly) assuming that it will only
> have to retrieve 15 rows on a sequential scan, and gives a total cost of
> 3.69. In reality, it has to scan many rows forward until it finds the
> correct value, yielding very poor performance for my table.
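
That is exactly how the limit is being costed: the planner scales the seq
scan's total cost by the fraction of its estimated matching rows that the
limit needs, which is where the 3.69 comes from (figures taken from the
explain output above):

-- 252424.24 * (15 / 1025157) is roughly 3.69
select 252424.24 * (15.0 / 1025157);

That estimate only holds if the matching rows are spread evenly through the
table; if they are clustered towards the end, the seq scan has to read far
more of the table than it expected.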
>
>
>
> If I disable sequential scan (set enable_seqscan=false) it then incorrectly
> uses index A, the one with 95% null values: it seems to apply the same flawed
> logic again, assuming it will only have to retrieve 15 rows because of the
> limit clause, and concludes that an index scan on A is cheaper than one on B.
>
>
>
> Only by deleting the index on A and disabling sequential scan will it use
> the correct index, which is of course by far the fastest.
>
>
>
> Is there an assumption in the planner that a limit of 15 means postgres will
> only have to read 15 rows? If so, is this a bad assumption?
> If a particular query is faster without a limit, then surely it will also be
> faster with the limit.
>
>
>
> Any workarounds for this?
>
>
>
> Thanks
>
> David