Re: massive performance hit when using "Limit 1"

From: Rich Doughty <rich(at)opusvl(dot)com>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: massive performance hit when using "Limit 1"
Date: 2005-12-06 13:01:13
Message-ID: 43958B99.8090904@opusvl.com
Lists: pgsql-general

Richard Huxton wrote:
> Rich Doughty wrote:
>
>>
>> This one goes nuts and doesn't return. is there any way i can
>> force a query plan similar to the one above?
>>
>> EXPLAIN SELECT _t.* FROM
>> tokens.ta_tokens _t INNER JOIN
>> tokens.ta_tokens_stock _s ON _t.token_id = _s.token_id
>> WHERE
>> _s.retailer_id = '96599' AND
>> _t.value = '10'
>> ORDER BY
>> _t.number ASC
>> LIMIT '1';
>>                                                     QUERY PLAN
>> -------------------------------------------------------------------------------------------------------------------
>>  Limit  (cost=0.00..14967.39 rows=1 width=27)
>>    ->  Nested Loop  (cost=0.00..22316378.56 rows=1491 width=27)
>>          ->  Index Scan using ta_tokens_number_key on ta_tokens _t  (cost=0.00..15519868.33 rows=1488768 width=27)
>>                Filter: ((value)::numeric = 10::numeric)
>>          ->  Index Scan using ta_tokens_stock_pkey on ta_tokens_stock _s  (cost=0.00..4.55 rows=1 width=4)
>>                Index Cond: (("outer".token_id)::integer = (_s.token_id)::integer)
>>                Filter: ((retailer_id)::integer = 96599)
>
>
> I *think* what's happening here is that PG thinks it will use the index
> on _t.number (since you are going to sort by that anyway) and pretty
> soon find a row that will:
> 1. have value=10
> 2. join to a row in _s with the right retailer_id
> It turns out that isn't the case, and so the query takes forever.
> Without knowing what "value" and "number" mean it's difficult to be
> sure, but I'd guess it's the "token_id" join part that's the problem,
> since at a guess a high-numbered retailer will have tokens with
> high-numbered "retailer_id".
>
> If you'd posted EXPLAIN ANALYSE then we'd be able to see what actually
> did happen.

no chance. it takes far too long to return (days...).

> Try the same query but with a low retailer_id (100 or something) and see
> if it goes a lot quicker. If that is what the problem is, try changing
> the ORDER BY to something like "_s.retailer_id, _t.value, _t.number" and
> see if that gives the planner a nudge in the right direction.
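[Richard's suggested nudge, applied to the query from the original post, would look something like the following. This is a sketch only; whether it actually changes the plan depends on the schema and statistics, which aren't shown in the thread.]

```sql
-- Same query, but with the expanded ORDER BY Richard suggests, intended to
-- discourage the planner from walking ta_tokens_number_key and hoping for
-- an early match on value and retailer_id.
SELECT _t.*
FROM tokens.ta_tokens _t
INNER JOIN tokens.ta_tokens_stock _s ON _t.token_id = _s.token_id
WHERE _s.retailer_id = '96599'
  AND _t.value = '10'
ORDER BY _s.retailer_id, _t.value, _t.number
LIMIT 1;
```

[Since retailer_id and value are constant for every returned row, the result order is the same as ORDER BY _t.number alone.]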

the retailer_id would make no difference as there are only 4000-ish rows in
ta_tokens_stock and they all (for now) have the same retailer_id.

> Failing that, a change to your indexes will almost certainly help.

i'm not sure that's the case. the exact same query, but with a LIMIT of 2 or
more, is fine.

I found this in the 8.0.4 release notes. i reckon it's a good guess that this
is the problem:

* Fix mis-planning of queries with small LIMIT values due to poorly thought
out "fuzzy" cost comparison
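[This workaround is not from the thread; it is a commonly used PostgreSQL idiom, sketched here under the assumption that the bug above is indeed the cause and an upgrade to 8.0.4+ isn't immediately possible. A subquery with its own ORDER BY and OFFSET 0 acts as an optimization fence, so the inner query gets planned without the outer LIMIT 1 distorting its cost comparison.]

```sql
-- Hypothetical workaround: plan the join and sort without the LIMIT,
-- then take the first row from the fenced subquery.
SELECT *
FROM (
    SELECT _t.*
    FROM tokens.ta_tokens _t
    INNER JOIN tokens.ta_tokens_stock _s ON _t.token_id = _s.token_id
    WHERE _s.retailer_id = '96599'
      AND _t.value = '10'
    ORDER BY _t.number ASC
    OFFSET 0  -- prevents the planner from flattening the subquery
) sub
LIMIT 1;
```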

--

- Rich Doughty
