Re: Query plan choice issue

From: Yaroslav Tykhiy <yar(at)barnet(dot)com(dot)au>
To: Yaroslav Tykhiy <yar(at)barnet(dot)com(dot)au>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general(at)postgresql(dot)org
Subject: Re: Query plan choice issue
Date: 2010-09-20 08:08:38
Message-ID: 3874A3D4-F5B7-4BF8-96A4-753B1D1719A8@barnet.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 14/09/2010, at 10:37 AM, Yaroslav Tykhiy wrote:
>
> On 14/09/2010, at 12:41 AM, Tom Lane wrote:
>
>> Yaroslav Tykhiy <yar(at)barnet(dot)com(dot)au> writes:
[...]
>>
>> I think the major problem you're having is that the planner is
>> completely clueless about the selectivity of the condition
>> "substring"(v.headervalue, 0, 255) ~~* '%<(dot)(dot)(dot)(at)mail(dot)gmail(dot)com>%'
>> If it knew that that would match only one row, instead of several
>> thousand, it would likely pick a different plan.
>>
>> In recent versions of PG you could probably make a noticeable
>> improvement in this if you just dropped the substring() restriction
>> ... do you actually need that? Alternatively, if you don't want to
>> change the query logic at all, I'd try making an index on
>> substring(v.headervalue, 0, 255). I'm not expecting the query
>> to actually *use* the index, mind you. But its existence will prompt
>> ANALYZE to collect stats on the expression's value, and that will
>> help the planner with estimating the ~~* condition.
>
> Well, that substring() and ILIKE combo looked suspicious to me,
> too. However, there already was an index on
> substring(v.headervalue, 0, 255) but the fast query plan didn't seem
> to use it, it used a different index instead:
>
[...]
> Meanwhile, a mate of mine lurking on this list pointed out that
> reducing random_page_cost might help here and it did:
> random_page_cost of 2 made the fast query favourable.
>
> Can it mean that the default planner configuration slightly
> overfavours seq scans?
>

Funnily, after a few days of running with random_page_cost=2, exactly
the same query became slow again and I had to reduce random_page_cost
further to 1.5 to make it fast. Can it be a sign of a problem in the
planner?

Thanks!

Yar

In response to

Browse pgsql-general by date

  From Date Subject
Next Message John R Pierce 2010-09-20 08:12:38 Re: space taken by NULL values in array
Previous Message Leonardo Francalanci 2010-09-20 07:58:45 space taken by NULL values in array