Re: a wrong index choose when statistics is out of date

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andy Fan <zhihuifan1213(at)163(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: a wrong index choose when statistics is out of date
Date: 2024-03-04 23:13:38
Message-ID: CAApHDvr6Dv-oHN3qn3t9Z3QHLT0qi80fa_tRNmmjG8r9BhzoHA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 5 Mar 2024 at 00:37, Andy Fan <zhihuifan1213(at)163(dot)com> wrote:
>
> David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> > I don't think it would be right to fudge the costs in any way, but I
> > think the risk factor for IndexPaths could take into account the
> > number of unmatched index clauses and increment the risk factor, or
> > "certainty_factor" as it is currently in my brain-based design. That
> > way add_path() would be more likely to prefer the index that matches
> > the most conditions.
>
> This is somehow similar with my proposal at [1]? What do you think
> about the treat 'col op const' as 'col op $1' for the marked column?
> This could just resolve a subset of questions in your mind, but the
> method looks have a solid reason.

Do you mean this?

> + /*
> + * To make the planner more robust to handle some inaccurate statistics
> + * issue, we will add a extra cost to qpquals so that the less qpquals
> + * the lower cost it has.
> + */
> + cpu_run_cost += 0.01 * list_length(qpquals);

I don't think it's a good idea to add cost penalties like you proposed
there. This is what I meant by "I don't think it would be right to
fudge the costs in any way".

If you modify the costs to add some small penalty so that the planner
is more likely to favour some other plan, what happens if we then
decide the other plan has some issue and we want to penalise that for
some other reason? Adding the 2nd penalty might result in the original
plan choice again. Which one should be penalised more? I think the
uncertainty needs to be tracked separately.

Fudging the costs like this is also unlikely to play nicely with
add_path's use of STD_FUZZ_FACTOR. There'd be an incentive to do
things like total_cost *= STD_FUZZ_FACTOR; to ensure we get a large
enough penalty.

David

> [1] https://www.postgresql.org/message-id/CAApHDvovVWCbeR4v%2BA4Dkwb%3DYS_GuJG9OyCm8jZu%2B%2BcP2xsY_A%40mail.gmail.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-03-04 23:18:29 Re: Add new error_action COPY ON_ERROR "log"
Previous Message Michael Paquier 2024-03-04 22:53:09 Re: Avoid is possible a expensive function call (src/backend/utils/adt/varlena.c)