Re: POC, WIP: OR-clause support for indexes

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Andrei Lepikhov <lepihov(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Nikolay Shaplov <dhyan(at)nataraj(dot)su>, pgsql-hackers(at)lists(dot)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, teodor(at)sigaev(dot)ru, Peter Eisentraut <peter(at)eisentraut(dot)org>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
Subject: Re: POC, WIP: OR-clause support for indexes
Date: 2024-11-17 23:19:49
Message-ID: CAPpHfduqOuvh6i=jYDPWnAUg325hsOLUAW9r_awdirQRA7uzHA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 15, 2024 at 3:27 PM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> On Mon, Oct 28, 2024 at 6:55 PM Alena Rybakina
> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
> > I may be wrong, but the original idea was to double-check the result with the original expression.
> >
> > But I'm willing to agree with you. I think we should add transformed rinfo variable through add_predicate_to_index_quals function. I attached the diff file to the letter.
> >
> > diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
> > index 3da7ea8ed57..c68ac7008e6 100644
> > --- a/src/backend/optimizer/path/indxpath.c
> > +++ b/src/backend/optimizer/path/indxpath.c
> > @@ -3463,10 +3463,11 @@ match_orclause_to_indexcol(PlannerInfo *root,
> > * rinfo in iclause->rinfo to detect duplicates and recheck the original
> > * clause.
> > */
> > + RestrictInfo *rinfo_new = make_simple_restrictinfo(root,
> > + &saopexpr->xpr);
> > iclause = makeNode(IndexClause);
> > - iclause->rinfo = rinfo;
> > - iclause->indexquals = list_make1(make_simple_restrictinfo(root,
> > - &saopexpr->xpr));
> > + iclause->rinfo = rinfo_new;
> > + iclause->indexquals = add_predicate_to_index_quals(index, list_make1(rinfo_new));
> > iclause->lossy = false;
> > iclause->indexcol = indexcol;
> > iclause->indexcols = NIL;
>
> As I stated in [1], I don't think we should pass transformed clause to
> IndexClause.rinfo while comment explicitly says us to pass original
> rinfo there.
>
> > I figured out comments that you mentioned and found some addition explanation.
> >
> > As I understand it, this processing is related to ensuring that the selectivity of the index is assessed correctly and that there is no underestimation, which can lead to the selection of a partial index in the plan. See comment for the add_predicate_to_index_quals function:
> >
> > * ANDing the index predicate with the explicitly given indexquals produces
> > * a more accurate idea of the index's selectivity. However, we need to be
> > * careful not to insert redundant clauses, because clauselist_selectivity()
> > * is easily fooled into computing a too-low selectivity estimate. Our
> > * approach is to add only the predicate clause(s) that cannot be proven to
> > * be implied by the given indexquals. This successfully handles cases such
> > * as a qual "x = 42" used with a partial index "WHERE x >= 40 AND x < 50".
> > * There are many other cases where we won't detect redundancy, leading to a
> > * too-low selectivity estimate, which will bias the system in favor of using
> > * partial indexes where possible. That is not necessarily bad though.
> > *
> > * Note that indexQuals contains RestrictInfo nodes while the indpred
> > * does not, so the output list will be mixed. This is OK for both
> > * predicate_implied_by() and clauselist_selectivity(), but might be
> > * problematic if the result were passed to other things.
> > */
> >
> > In those comments that you mentioned, it was written that this problem of expression redundancy is checked using the predicate_implied_by function, note that it is called there.
> >
> > * In some situations (particularly with OR'd index conditions) we may * have scan_clauses that are not equal to, but are logically implied by, * the index quals; so we also try a predicate_implied_by() check to see * if we can discard quals that way. (predicate_implied_by assumes its * first input contains only immutable functions, so we have to check * that.)
>
> As the first line of header comment of add_predicate_to_index_quals()
> says it adds partial index predicate to the quals list. I don't see
> why should we use that in match_orclause_to_indexcol(), because this
> function is only responsible to matching rinfo to particular index
> column. Matching of partial index predicate is handled elsewhere.
> Also check there is get_index_clause_from_support(), which is fetch
> transformed clause from a support function. And it doesn't have to
> fiddle with add_predicate_to_index_quals().
>
> > I also figured out more information about loosy variable. First of all, I tried changing the value of the variable and did not notice any difference in regression tests. As I understood, our transformation is completely equivalent, so loosy should be true. But I don't think they are needed since our expressions are equivalent. I thought for a long time about an example where this could be a mistake and didn’t come up with any of them.
>
> Yes, our transformation isn't lossy, thus IndexClause.lossy should be unset.

Here is the next revision of this patch. No material changes,
adjustments for comments and commit message.

------
Regards,
Alexander Korotkov
Supabase

Attachment Content-Type Size
v44-0001-Transform-OR-clauses-to-SAOP-s-during-index-matc.patch application/octet-stream 43.5 KB
v44-0002-Teach-bitmap-path-generation-about-transforming-.patch application/octet-stream 38.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2024-11-18 00:01:49 Reduce TupleHashEntryData struct size by half
Previous Message Greg Sabino Mullane 2024-11-17 18:50:53 Re: optimize file transfer in pg_upgrade