Re: Make the qual cost on index Filter slightly higher than qual cost on index Cond.

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Make the qual cost on index Filter slightly higher than qual cost on index Cond.
Date: 2020-05-29 13:37:03
Message-ID: CAExHW5sYEcVoqGnCMgQEA9a-BOxn83GeiKvExZRSJSidQ9NU+g@mail.gmail.com
Lists: pgsql-hackers

On Fri, May 29, 2020 at 6:40 AM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
>
>
>>
>> >so we need to optimize the cost model for such case, the method is the
>> >patch I mentioned above.
>>
>> Making the planner more robust w.r.t. to estimation errors is nice, but
>> I wouldn't go as far saying we should optimize for such cases. The stats
>> can be arbitrarily off, so should we expect the error to be 10%, 100% or
>> 1000000%?
>
>
> I don't think my patch relies on anything like that. My patch doesn't fix the
> statistics issue; it just adds extra cost to the qual cost of the Index Filter part.
> Assume the query pattern is WHERE col1 = X AND col2 = Y. The impacts are:
> 1). The cost of (col1, other_column) becomes higher than (col1, col2).
> 2). The relationship between seqscan and index scan on index (col1, other_column)
> is changed (this is something I don't want). However, the cost difference between
> index scan & seq scan is usually very large, so the change above should have
> nearly no impact on that choice. 3). The cost of an index scan on
> index (col1) only becomes higher. Overall I think nothing will get worse.

When the statistics are almost correct (or better than what you have in
your example), the index which does not cover all the columns in all
the conditions will be more expensive anyway because of the extra cost of
accessing the heap for the extra rows not filtered by that index. An index
covering all the conditions would have a cheaper scan cost, since
there will be fewer rows, and hence fewer heap page accesses, because of
the additional filtering. So I don't think we need any change in the current
costing model.

>
>>
>> We'd probably end up with plans that handle worst cases well,
>> but the average performance would end up being way worse :-(
>>
>
> That's possible, which is why I hope to get some feedback on that. Actually I
> can't think of such a case. Do you have anything like that in mind?
>
> ----
> I'm feeling that (qpqual_cost.per_tuple * 1.001) is not good enough, since a user
> may have something like WHERE expensive_func(col1) = X. We could change it to
> cpu_tuple_cost + qpqual_cost.per_tuple + (0.0001) * list_length(qpquals).
>
> --
> Best Regards
> Andy Fan

--
Best Wishes,
Ashutosh Bapat
