Re: LIKE op with B-Tree Index?

From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Sam Wong *EXTERN*" <sam(at)hellosam(dot)net>, "Merlin Moncure" <mmoncure(at)gmail(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: LIKE op with B-Tree Index?
Date: 2012-10-18 07:03:13
Message-ID: D960CB61B694CF459DCFB4B0128514C2089027DD@exadv11.host.magwien.gv.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Sam Wong wrote:
>>>>> I am investigating a performance issue involved with LIKE 'xxxx%'
>>>>> on an index in a complex query with joins.

>>>>> Q1.
>>>>> SELECT * FROM shipments WHERE shipment_id LIKE '12345678%'
>>>>>
>>>>> Q2.
>>>>> SELECT * FROM shipments WHERE shipment_id >= '12345678' AND
>>>>> shipment_id < '12345679'

[Q1 and Q2 have different row estimates]

Merlin wrote:
>> Right -- I didn't visualize it properly. Still, you're asking
>> the server to infer that
>> since you're looking between to adjacent textual characters range
bounded
>> [) it convert the 'between' to a partial
>> string search. That hold up logically but probably isn't worth
>> spending cycles to do, particularly in cases of non-ascii mappable
unicode
>> characters.

> Postgresql did that already. Refer to the analyze result of Q1 and Q2,
it
> gives
> "Index Cond: ((shipment_id >= '12345678'::text) AND (shipment_id <
> '12345679'::text))"
> (I also just realized they did it just now)
>
> Yet, with additional Filter (ref Q1 analyze), it's surprisingly that
it
> estimates Q1 will have more rows that Q2.
>
> FYI, I made a self-contained test case and submitted a bug #7610.

Did you try to increase the statistics for column "shipment_id"?

This will probably not make the difference go away, but
if the estimate gets better, it might be good enough for
the planner to pick the correct plan.

Yours,
Laurenz Albe

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Thom Brown 2012-10-18 16:11:51 Unused index influencing sequential scan plan
Previous Message Andrea Suisani 2012-10-18 06:57:12 Re: Two identical systems, radically different performance