Re: POC, WIP: OR-clause support for indexes

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, teodor(at)sigaev(dot)ru, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
Subject: Re: POC, WIP: OR-clause support for indexes
Date: 2024-06-14 12:00:33
Message-ID: CAPpHfduah1PLzajBJFDmp7+MZuaWYpie2p+GsV0r03fcGghQ-g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 8, 2024 at 1:34 AM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
>
> I've revised the patch. Did some beautification, improvements for
> documentation, commit messages etc.
>
> I've pushed the 0001 patch without 0002. I think 0001 is good by
> itself given that there is the or_to_any_transform_limit GUC option.
> The more similar OR clauses are here the more likely grouping them
> into SOAP will be a win. But I've changed the default value to 5.
> This will make it less invasive and affect only queries with obvious
> repeating patterns. That also reduced the changes in the regression
> tests expected outputs.
>
> Regarding 0002, it seems questionable since it could cause a planning
> slowdown for SAOP's with large arrays. Also, it might reduce the win
> of transformation made by 0001. So, I think we should skip it for
> now.

The patch has been reverted from pg17. Let me propose a new version
for pg18 based on the valuable feedback from Tom Lane [1][2].

* The transformation is moved to the stage of adding restrictinfos to
the base relation (in particular add_base_clause_to_rel()). This
leads to interesting consequences. While this allows IndexScans to
use transformed clauses, BitmapScans and SeqScans seem unaffected.
Therefore, I wasn't able to find a planning regression.
* As soon as there is no planning regression anymore, I've removed
or_to_any_transform_limit GUC, which was a source of critics.
* Now, not only Consts allowed in the SAOP's list, but also Params.
* The criticized hash based on expression jumbling has been removed.
Now, the plain list is used instead.
* OrClauseGroup now gets a legal node tag. That allows to mix it in
the list with other nodes without hacks.

I think this patch shouldn't be as good as before for optimizing
performance of large OR lists, given that BitmapScans and SeqScans
still deal with ORs. However, it allows IndexScans to handle more,
doesn't seem to cause planning regression and therefore introduce no
extra GUC. Overall, this seems like a good compromise.

This patch could use some polishing, but I'd like to first hear some
feedback on general design.

Links
1. https://www.postgresql.org/message-id/3604469.1712628736%40sss.pgh.pa.us
2. https://www.postgresql.org/message-id/3649287.1712642139%40sss.pgh.pa.us

------
Regards,
Alexander Korotkov
Supabase

Attachment Content-Type Size
v25-0001-Transform-OR-clauses-to-ANY-expression.patch application/octet-stream 34.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2024-06-14 12:10:01 Re: RFC: adding pytest as a supported test framework
Previous Message Alexander Lakhin 2024-06-14 12:00:00 Re: Remove dependence on integer wrapping