Re: Replace IN VALUES with ANY in WHERE clauses during optimization

From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Andrei Lepikhov <lepihov(at)gmail(dot)com>, Ivan Kush <ivan(dot)kush(at)tantorlabs(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Subject: Re: Replace IN VALUES with ANY in WHERE clauses during optimization
Date: 2025-03-29 19:03:00
Message-ID: dc727ff3-10d8-4794-a60b-168c428d3b15@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

On 29.03.2025 14:03, Alexander Korotkov wrote:
> On Wed, Mar 12, 2025 at 8:11 PM Alena Rybakina
> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>> On 06.03.2025 11:23, Alexander Korotkov wrote:
>>
>> Hi, Alena!
>>
>> On Sat, Mar 1, 2025 at 1:39 PM Alena Rybakina<a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>>
>> On 09.02.2025 18:38, Alexander Korotkov wrote:
>>
>> Also, aren't we too restrictive while requiring is_simple_values_sequence()?
>> For instance, I believe cases like this (containing Var) could be transformed too.
>>
>> select * from t t1, lateral (select * from t t2 where t2.i in (values (t1.i), (1)));
>>
>> I added it and attached a patch with diff file. To be honest, I didn't find queries except for var with volatile functions where the transform can't be applied.
>>
>> I removed the function volatility check that I added in the previous version, since we already check it in is_simple_values_sequence.
>>
>> I'm not sure about only cases where var can refer to something outside available_rels list but I couldn't come up with an example where that's possible, what do you think?
>>
>> Considering it again, I think we can't face problems like that because we don't work with join.
>>
>> I attached a diff file as a difference with the 3rd version of the patch, when we did not consider the values with var for transformation.
>>
>> I take detailed look at makeSAOPArrayExpr() function, which is much
>> more complex than corresponding fragment from
>> match_orclause_to_indexcol(). And I found it to be mostly wrong. We
>> are working in post parse-analyze stage. That means it's too late to
>> do type coercion or lookup operator by name. We have already all the
>> catalog objects nailed down. In connection with that, second argument
>> of OpExpr shouldn't be ignored as it might contain amrelevant type
>> cast. I think I've fixed the most of them problems in the attached
>> patchset.
>>
>>
>> I agree with your conclusion and changes.
> I've revised the patchset. Mostly comments/commit messages and minor
> refactoring. Also, I've removed our subquery check
> completely. Not sure if we need it at all. I'll further analyze
> that.

I agree with your changes, the code really started to look better and
more understandable. Thank you!

As for function - it checked that values didn't contain any subquery
elements (if they consists RangeTblEntry type variables) and when you
removed it you caused the problem with sublevel parameter.

> One thing I have to fix: we must do
> IncrementVarSublevelsUp() unconditionally for all expressions as Vars
> could be deeper inside.

Yes, I'm looking at it too, I've just understood that it was needed for
subqueries - they can contain var elements which needs decrease the
sublevel parameter.

for example for the query:

EXPLAIN (COSTS OFF)
SELECT ten FROM onek t
WHERE unique1 IN (VALUES (0), ((2 IN (SELECT unique2 FROM onek c
  WHERE c.unique2 = t.unique1))::integer));

We are interested in this element: ((2 IN (SELECT unique2 FROM onek c 
WHERE c.unique2 = *t.unique1*))

It is funcexpr object with RabgeTblEntry variable. I highlighted

WARNING:  1{FUNCEXPR :funcid 2558 :funcresulttype 23 :funcretset false
:funcvariadic false :funcformat 1 :funccollid 0 :inputcollid 0 :args
({SUBLINK :subLinkType 2 :subLinkId 0 :testexpr {OPEXPR :opno 96
:opfuncid 65 :opresulttype 16 :opretset false :opcollid 0 :inputcollid 0
:args ({CONST :consttype 23 :consttypmod -1 :constcollid 0 :constlen 4
:constbyval true :constisnull false :location -1 :constvalue 4 [ 2 0 0 0
0 0 0 0 ]} {PARAM :paramkind 2 :paramid 1 :paramtype 23 :paramtypmod -1
:paramcollid 0 :location -1}) :location -1} :operName ("=") :subselect
{QUERY :commandType 1 :querySource 0 :canSetTag true :utilityStmt <>
:resultRelation 0 :hasAggs false :hasWindowFuncs false :hasTargetSRFs
false :hasSubLinks false :hasDistinctOn false :hasRecursive false
:hasModifyingCTE false :hasForUpdate false :hasRowSecurity false
:hasGroupRTE false :isReturn false :cteList <> :rtable ({RANGETBLENTRY
:alias {ALIAS :aliasname c :colnames <>} :eref {ALIAS :aliasname c
:colnames ("unique1" "unique2" "two" "four" "ten" "twenty" "hundred"
"thousand" "twothousand" "fivethous" "tenthous" "odd" "even" "stringu1"
"stringu2" "string4")} :rtekind 0 :relid 32795 :inh true :relkind r
:rellockmode 1 :perminfoindex 1 :tablesample <> :lateral false :inFromCl
true :securityQuals <>}) :rteperminfos ({RTEPERMISSIONINFO :relid 32795
:inh true :requiredPerms 2 :checkAsUser 0 :selectedCols (b 9)
:insertedCols (b) :updatedCols (b)}) :jointree {FROMEXPR :fromlist
({RANGETBLREF :rtindex 1}) :quals {OPEXPR :opno 96 :opfuncid 65
:opresulttype 16 :opretset false :opcollid 0 :inputcollid 0 :args ({VAR
:varno 1 :varattno 2 :vartype 23 :vartypmod -1 :varcollid 0
:varnullingrels (b) :varlevelsup 0 :varreturningtype 0 :varnosyn 1
:varattnosyn 2 :location -1} *{VAR :varno 1 :varattno 1 :vartype 23
:vartypmod -1 :varcollid 0 :varnullingrels (b) :varlevelsup 2
:varreturningtype 0 :varnosyn 1 :varattnosyn 1 :location -1}) :location
-1}*} :mergeActionList <> :mergeTargetRelation 0 :mergeJoinCondition <>
:targetList ({TARGETENTRY :expr {VAR :varno 1 :varattno 2 :vartype 23
:vartypmod -1 :varcollid 0 :varnullingrels (b) :varlevelsup 0
:varreturningtype 0 :varnosyn 1 :varattnosyn 2 :location -1} :resno 1
:resname unique2 :ressortgroupref 0 :resorigtbl 32795 :resorigcol 2
:resjunk false}) :override 0 :onConflict <> :returningOldAlias <>
:returningNewAlias <> :returningList <> :groupClause <> :groupDistinct
false :groupingSets <> :havingQual <> :windowClause <> :distinctClause
<> :sortClause <> :limitOffset <> :limitCount <> :limitOption 0
:rowMarks <> :setOperations <> :constraintDeps <> :withCheckOptions <>
:stmt_location -1 :stmt_len -1} :location -1}) :location -1}

I highlighted in bold the var we need - since it is in a subquery in the
in expression will be flattened, all elements contained in it should
decrease the level number by one, since they will belong to the subtree
located above it. Because of that condition, this did not happen.

I generally agree with you that it is better to remove that condition.
The function IncrementVarSublevelsUp essentially goes through the
structures below and will decrease the level of only the vars for which
this needs to be done, and the condition with 1 will protect us from
touching those vars that should not. So the varlevelsup for this var
should be 1.

I am currently investigating whether this transformation will be fair
for all cases; I have not found any problems yet.

--
Regards,
Alena Rybakina
Postgres Professional

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kirill Reshke 2025-03-29 19:27:22 Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
Previous Message David G. Johnston 2025-03-29 18:58:22 Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).