why memoize is not used for correlated subquery

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: why memoize is not used for correlated subquery
Date: 2024-05-28 07:31:03
Message-ID: CAFj8pRBRyML16xf4W-UJGg+O1OaH4WDR9BcwZhVzW0YLCD4XBA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

I am playing with examples for P2D2, and I found few issues related to
memoize

1. I use dataset https://pgsql.cz/files/obce.sql - it is data about czech
population

Dictionary - "obec" -> "village", "pocet_muzu" -> "number_of_men",
"pocet_zen" -> "number_of_woman", "okres" -> "district", "nazev" -> "name"

I wrote the query - biggest village per district

select nazev
from obce o
where pocet_muzu + pocet_zen = (select max(pocet_muzu + pocet_zen)
from obce
where o.okres_id = okres_id);

I expected usage of memoize, because in this query, it can be very
effective https://explain.depesz.com/s/0ubC

(2024-05-28 09:09:58) postgres=# explain select nazev from obce o where
pocet_muzu + pocet_zen = (select max(pocet_muzu + pocet_zen) from obce
where o.okres_id = okres_id);
QUERY PLAN

══════════════════════════════════════════════════════════════════════════════════════════════════════════
Seq Scan on obce o (cost=0.00..33588.33 rows=31 width=10)
Filter: ((pocet_muzu + pocet_zen) = (SubPlan 2))
SubPlan 2
-> Result (cost=5.34..5.35 rows=1 width=4)
InitPlan 1
-> Limit (cost=0.28..5.34 rows=1 width=4)
-> Index Scan Backward using obce_expr_idx on obce
(cost=0.28..409.92 rows=81 width=4)
Index Cond: ((pocet_muzu + pocet_zen) IS NOT NULL)
Filter: ((o.okres_id)::text = (okres_id)::text)
(9 rows)

But it doesn't do. I rewrote this query to lateral join, and memoize was
used, but the result was not good, because filter wa pushed to subquery

explain select * from obce o, lateral (select max(pocet_zen + pocet_muzu)
from obce where o.okres_id = okres_id) where pocet_zen + pocet_muzu = max;
QUERY PLAN

══════════════════════════════════════════════════════════════════════════════════════════════════════
Nested Loop (cost=12.83..19089.82 rows=31 width=45)
-> Seq Scan on obce o (cost=0.00..121.50 rows=6250 width=41)
-> Memoize (cost=12.83..12.85 rows=1 width=4)
Cache Key: (o.pocet_zen + o.pocet_muzu), o.okres_id
Cache Mode: binary
-> Subquery Scan on unnamed_subquery (cost=12.82..12.84 rows=1
width=4)
Filter: ((o.pocet_zen + o.pocet_muzu) = unnamed_subquery.max)
-> Aggregate (cost=12.82..12.83 rows=1 width=4)
-> Index Scan using obce_okres_id_idx on obce
(cost=0.28..12.41 rows=81 width=8)
Index Cond: ((okres_id)::text =
(o.okres_id)::text)
(10 rows)

and then the effect of memoize is almost negative
https://explain.depesz.com/s/TKLL

When I used optimization fence, then memoize was used effectively
https://explain.depesz.com/s/hhgi

explain select * from (select * from obce o, lateral (select max(pocet_zen
+ pocet_muzu) from obce where o.okres_id = okres_id) offset 0) where
pocet_zen + pocet_muzu = max;
QUERY PLAN

══════════════════════════════════════════════════════════════════════════════════════════════════════
Subquery Scan on unnamed_subquery (cost=12.83..1371.93 rows=31 width=45)
Filter: ((unnamed_subquery.pocet_zen + unnamed_subquery.pocet_muzu) =
unnamed_subquery.max)
-> Nested Loop (cost=12.83..1278.18 rows=6250 width=45)
-> Seq Scan on obce o (cost=0.00..121.50 rows=6250 width=41)
-> Memoize (cost=12.83..12.84 rows=1 width=4)
Cache Key: o.okres_id
Cache Mode: binary
-> Aggregate (cost=12.82..12.83 rows=1 width=4)
-> Index Scan using obce_okres_id_idx on obce
(cost=0.28..12.41 rows=81 width=8)
Index Cond: ((okres_id)::text =
(o.okres_id)::text)
(10 rows)

My question is - does memoize support subqueries? And can be enhanced to
support this exercise without LATERAL and optimization fences?

Regards

Pavel

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tender Wang 2024-05-28 07:46:21 Re: why memoize is not used for correlated subquery
Previous Message Pradeep Kumar 2024-05-28 07:10:47 Re: Need clarification on compilation errors in PG 16.2