Considering fractional paths in Append node

From: Nikita Malakhov <hukutoc(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Considering fractional paths in Append node
Date: 2024-10-16 18:24:10
Message-ID: CAN-LCVPxnWB39CUBTgOQ9O7Dd8DrA_tpT1EY3LNVnUuvAX1NjA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi hackers!

A colleague of mine, Andrei Lepikhov, has found interesting behavior
in path cost calculation for Append node - when evaluating the cheapest
path it does not take into account fractional path costs.

We've prepared a patch that forces add_paths_to_append_rel function
to consider non-parametrized fractional path costs.

The effect is easily seen in one of standard PG tests:
Vanilla (current master):
explain analyze
select t1.unique1 from tenk1 t1
inner join tenk2 t2 on t1.tenthous = t2.tenthous and t2.thousand = 0
union all
(values(1)) limit 1;
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..219.55 rows=1 width=4) (actual time=6.309..6.312 rows=1
loops=1)
-> Append (cost=0.00..2415.09 rows=11 width=4) (actual
time=6.308..6.310 rows=1 loops=1)
-> Nested Loop (cost=0.00..2415.03 rows=10 width=4) (actual
time=6.307..6.308 rows=1 loops=1)
Join Filter: (t1.tenthous = t2.tenthous)
Rows Removed by Join Filter: 4210
-> Seq Scan on tenk1 t1 (cost=0.00..445.00 rows=10000
width=8) (actual time=0.004..0.057 rows=422 loops=1)
-> Materialize (cost=0.00..470.05 rows=10 width=4) (actual
time=0.000..0.014 rows=10 loops=422)
Storage: Memory Maximum Storage: 17kB
-> Seq Scan on tenk2 t2 (cost=0.00..470.00 rows=10
width=4) (actual time=0.076..5.535 rows=10 loops=1)
Filter: (thousand = 0)
Rows Removed by Filter: 9990
-> Result (cost=0.00..0.01 rows=1 width=4) (never executed)
Planning Time: 0.324 ms
Execution Time: 6.336 ms
(14 rows)

Patched, the same test:
explain analyze
select t1.unique1 from tenk1 t1
inner join tenk2 t2 on t1.tenthous = t2.tenthous and t2.thousand = 0
union all
(values(1)) limit 1;
QUERY
PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.29..126.00 rows=1 width=4) (actual time=0.105..0.106 rows=1
loops=1)
-> Append (cost=0.29..1383.12 rows=11 width=4) (actual
time=0.104..0.105 rows=1 loops=1)
-> Nested Loop (cost=0.29..1383.05 rows=10 width=4) (actual
time=0.104..0.104 rows=1 loops=1)
-> Seq Scan on tenk2 t2 (cost=0.00..470.00 rows=10
width=4) (actual time=0.076..0.076 rows=1 loops=1)
Filter: (thousand = 0)
Rows Removed by Filter: 421
-> Index Scan using tenk1_thous_tenthous on tenk1 t1
(cost=0.29..91.30 rows=1 width=8) (actual time=0.026..0.026 rows=1 loops=1)
Index Cond: (tenthous = t2.tenthous)
-> Result (cost=0.00..0.01 rows=1 width=4) (never executed)
Planning Time: 0.334 ms
Execution Time: 0.130 ms
(11 rows)

Hope this optimization could be useful.

--
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company
https://postgrespro.ru/

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nikita Malakhov 2024-10-16 18:26:57 Re: Considering fractional paths in Append node
Previous Message Tom Lane 2024-10-16 18:17:58 Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)