From: | Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Peter Petrov <p(dot)petrov(at)postgrespro(dot)ru>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com> |
Cc: | Ranier Vilela <ranier(dot)vf(at)gmail(dot)com> |
Subject: | Re: pull-up subquery if JOIN-ON contains refs to upper-query |
Date: | 2025-04-02 16:39:58 |
Message-ID: | e172de23-7acb-4330-a40a-58ead7a47944@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi!
My colleague reviewed my patch and gave feedback on how to improve it -
for some queries with data types that I did not consider, pull-up is not
applied, although it should. Some of them:
EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF)
SELECT 1
FROM ta
WHERE EXISTS (SELECT 1
FROM tb
JOIN tc
ON ta.id = tb.id
AND tb.id = ANY('{1}'::int[])
);
QUERY PLAN
--------------------------------------------------------------------------
Seq Scan on ta (actual rows=1.00 loops=1)
Filter: EXISTS(SubPlan 1)
Rows Removed by Filter: 1
SubPlan 1
-> Nested Loop (actual rows=0.50 loops=2)
-> Seq Scan on tb (actual rows=0.50 loops=2)
Filter: ((id = ANY ('{1}'::integer[])) AND (ta.id = id))
Rows Removed by Filter: 2
-> Seq Scan on tc (actual rows=1.00 loops=1)
EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF)
SELECT 1
FROM ta
WHERE EXISTS (SELECT 1
FROM tb
JOIN tc
ON ta.id = tb.id
AND tb.is_active
);
QUERY PLAN
---------------------------------------------------------
Seq Scan on ta (actual rows=2.00 loops=1)
Filter: EXISTS(SubPlan 1)
SubPlan 1
-> Nested Loop (actual rows=1.00 loops=2)
-> Seq Scan on tb (actual rows=1.00 loops=2)
Filter: (is_active AND (ta.id = id))
Rows Removed by Filter: 0
-> Seq Scan on tc (actual rows=1.00 loops=2)
EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF)
SELECT 1
FROM ta
WHERE EXISTS (SELECT 1
FROM tb
JOIN tc
ON ta.id = tb.id
AND tb.is_active IS NOT NULL
);
QUERY PLAN
--------------------------------------------------------------------
Seq Scan on ta (actual rows=2.00 loops=1)
Filter: EXISTS(SubPlan 1)
SubPlan 1
-> Nested Loop (actual rows=1.00 loops=2)
-> Seq Scan on tb (actual rows=1.00 loops=2)
Filter: ((is_active IS NOT NULL) AND (ta.id = id))
Rows Removed by Filter: 0
-> Seq Scan on tc (actual rows=1.00 loops=2)
UPDATE tb SET is_active = NULL WHERE id = 2;
EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF)
SELECT 1
FROM ta
WHERE EXISTS (SELECT 1
FROM tb
JOIN tc
ON ta.id = tb.id
AND tb.is_active IS NULL
);
QUERY PLAN
----------------------------------------------------------------
Seq Scan on ta (actual rows=1.00 loops=1)
Filter: EXISTS(SubPlan 1)
Rows Removed by Filter: 1
SubPlan 1
-> Nested Loop (actual rows=0.50 loops=2)
-> Seq Scan on tb (actual rows=0.50 loops=2)
Filter: ((is_active IS NULL) AND (ta.id = id))
Rows Removed by Filter: 4
-> Seq Scan on tc (actual rows=1.00 loops=1)
I see that I need to add a walker that, when traversing the tree,
determines whether there are conditions under which pull-up is
impossible - the presence of
volatility of functions and other restrictions, and leave the
transformation for the var objects that I added before, I described it
here.
Unfortunately, I need a few days to implement this and need time for a
review, and I think I will not have time to do this before the code
freeze, so
I am moving this to the next commitfest and not changing the status
"awaiting the author".
On 11.02.2025 18:59, Alena Rybakina wrote:
> On 10.02.2025 23:51, Ilia Evdokimov wrote:
>>
>> On 09.02.2025 18:14, Alena Rybakina wrote:
>>> Hi! I found another example where the transformation worked
>>> incorrectly and reconsidered the idea.
>>>
>>> As for conversion of exists_sublink_to_ANY, we need to get the
>>> flattened implicit-AND list of clauses and pull out the chunks of
>>> the WHERE clause that belong to the parent query,
>>> since we are called halfway through the parent's
>>> preprocess_expression() and earlier steps of preprocess_expression()
>>> wouldn't get applied to the pulled-up stuff unless we do them here.
>>> We also do some processing for vars depending on which side the var
>>> is on - if it's in a subquery, we only need to lower its level
>>> (varlevel) because subquery will be flatted, while
>>> for other vars that belong to the parent query, we need to do
>>> preparation to pull up the sub-select into top range table.
>>>
>>> For those expressions that we couldn't assign to either list, we
>>> define newWhere and apply both cases.
>>>
>>
>> When I run 'make -C contrib/ check', tests of postgres_fdw extension
>> failed. I might be wrong, but you should be careful with LIMIT.
>>
> Thank you for the review, I'm working on it.
>
Sorry for not responding, but I will fix this bug after I update the
code based on the comments above. Thank you for noticing and writing to
me, your feedback is very important.
--
Regards,
Alena Rybakina
Postgres Professional
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2025-04-02 16:40:54 | Re: Draft for basic NUMA observability |
Previous Message | Peter Geoghegan | 2025-04-02 16:36:10 | Re: BTScanOpaqueData size slows down tests |