Re: Replace IN VALUES with ANY in WHERE clauses during optimization

From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Andrei Lepikhov <lepihov(at)gmail(dot)com>, Ivan Kush <ivan(dot)kush(at)tantorlabs(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Subject: Re: Replace IN VALUES with ANY in WHERE clauses during optimization
Date: 2025-02-09 19:47:26
Message-ID: 3b679503-a7be-4e00-9a75-189174e84027@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi! Thank you for your review!

On 09.02.2025 18:38, Alexander Korotkov wrote:
> On Sun, Feb 9, 2025 at 1:58 PM Alexander Korotkov<aekorotkov(at)gmail(dot)com> wrote:
>> On Thu, Jan 9, 2025 at 3:11 PM Alena Rybakina<a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>>> On 04.10.2024 12:05, Andrei Lepikhov wrote:
>>>> We also have an implementation of VALUES -> ARRAY transformation.
>>>> Because enterprises must deal with users' problems, many of these
>>>> users employ automatically generated queries.
>>>> Being informed very well of the consensus about that stuff, we've
>>>> designed it as a library. But, looking into the code now, I see that
>>>> it only needs a few cycles if no one 'x IN VALUES' expression is
>>>> presented in the query. Who knows? It may be OK for the core.
>>>> So, I've rewritten the code into the patch - see it in the attachment.
>>>>
>>>> The idea is quite simple - at the same place as
>>>> convert_ANY_sublink_to_join, we can test the SubLink on proper VALUES
>>>> RTE and perform the transformation if it's convertible.
>>> I updated the patch due to the problem with the coercion types for both
>>> sides of the expression.
>>>
>>> We must find a common type for both leftop of the expression and rightop
>>> including constants for correct transformation, and at the same time
>>> check that the resulting types are compatible.
>>>
>>> To do this we find an operator for the two input types if it is
>>> possible, and also remember the target types for the left and right
>>> sides, and after that make a coercion.
>>>
>>> This processing is only needed in cases where we are not working with
>>> parameters since the final type is not specified for the parameters.
>> I took a look at this patch.
>>
>> + /* TODO: remember parameters */

This comment is not relevant anymore.This comment was added during patch
development and so it is enough to save const params in
arrayExpr->elements. I'll delete it.

Andrei did review of my last code and improved it. I'll add his code too.

>> What was intended to do here?
>>
>> Also, aren't we too restrictive while requiring is_simple_values_sequence()?
>> For instance, I believe cases like this (containing Var) could be transformed too.
>>
>> select * from t t1, lateral (select * from t t2 where t2.i in (values (t1.i), (1)));

I am willing to agree with you because I didn't see any limitations for
that. After analyzing diff of regression tests and your example (below),
I think I will need to add a piece of logic of preparation to pull up
the sub-select into top range table like here [0] to correct processing
vars elements based on their position in the query.

[0]
https://www.postgresql.org/message-id/975a3736-a8b5-49b3-8009-4d4e86867aa1%40postgrespro.ru

alena(at)postgres=# explain select * from t t1, lateral (select * from t t2
where t2.x in (val
ues (t1.x), (1)));
ERROR:  bogus varlevelsup: 1 offset 0

So, I'm working on it.

> Also, I think there is quite a code duplication about construction of
> SAOP between match_orclause_to_indexcol() and convert_VALUES_to_ANY()
> functions. I would like to see a refactoring as a separate first
> patch, which extracts the common part into a function.

I completely agree with you. Ill add it.

--
Regards,
Alena Rybakina
Postgres Professional

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jelte Fennema-Nio 2025-02-09 20:06:02 Re: RFC: Allow EXPLAIN to Output Page Fault Information
Previous Message Rushabh Lathia 2025-02-09 18:58:45 Re: Support NOT VALID / VALIDATE constraint options for named NOT NULL constraints