Quick Links

Re: Considering fractional paths in Append node

From:	Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To:	Andy Fan <zhihuifan1213(at)163(dot)com>
Cc:	Nikita Malakhov <hukutoc(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject:	Re: Considering fractional paths in Append node
Date:	2025-03-02 08:46:09
Message-ID:	CAPpHfdvJk1=2Wagn=E4cGQVL=mSux=pgXeGcupwpYQSuLMxrhg@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi, Andy!

On Fri, Oct 18, 2024 at 3:55 AM Andy Fan <zhihuifan1213(at)163(dot)com> wrote:
>
> Nikita Malakhov <hukutoc(at)gmail(dot)com> writes:
>
>
> > The effect is easily seen in one of standard PG tests:
> > Vanilla (current master):
> > explain analyze
> > select t1.unique1 from tenk1 t1
> > inner join tenk2 t2 on t1.tenthous = t2.tenthous and t2.thousand = 0
> > union all
> > (values(1)) limit 1;
> > QUERY PLAN
> >
> > ------------------------------------------------------------------------------------------------------------------------------
> >
> > Limit (cost=0.00..219.55 rows=1 width=4) (actual time=6.309..6.312 rows=1 loops=1)
> > -> Append (cost=0.00..2415.09 rows=11 width=4) (actual time=6.308..6.310 rows=1 loops=1)
> > -> Nested Loop (cost=0.00..2415.03 rows=10 width=4) (actual time=6.307..6.308 rows=1 loops=1)
> > Join Filter: (t1.tenthous = t2.tenthous)
> > Rows Removed by Join Filter: 4210
> > -> Seq Scan on tenk1 t1 (cost=0.00..445.00 rows=10000 width=8) (actual time=0.004..0.057 rows=422
> > loops=1)
> > -> Materialize (cost=0.00..470.05 rows=10 width=4) (actual time=0.000..0.014 rows=10 loops=422)
> > Storage: Memory Maximum Storage: 17kB
> > -> Seq Scan on tenk2 t2 (cost=0.00..470.00 rows=10 width=4) (actual time=0.076..5.535 rows=10
> > loops=1)
> > Filter: (thousand = 0)
> > Rows Removed by Filter: 9990
> > -> Result (cost=0.00..0.01 rows=1 width=4) (never executed)
> > Planning Time: 0.324 ms
> > Execution Time: 6.336 ms
> > (14 rows)
> >
> > Patched, the same test:
> > explain analyze
> > select t1.unique1 from tenk1 t1
> > inner join tenk2 t2 on t1.tenthous = t2.tenthous and t2.thousand = 0
> > union all
> > (values(1)) limit 1;
> > QUERY PLAN
> >
> > --------------------------------------------------------------------------------------------------------------------------------------------------
> >
> > Limit (cost=0.29..126.00 rows=1 width=4) (actual time=0.105..0.106 rows=1 loops=1)
> > -> Append (cost=0.29..1383.12 rows=11 width=4) (actual time=0.104..0.105 rows=1 loops=1)
> > -> Nested Loop (cost=0.29..1383.05 rows=10 width=4) (actual time=0.104..0.104 rows=1 loops=1)
> > -> Seq Scan on tenk2 t2 (cost=0.00..470.00 rows=10 width=4) (actual time=0.076..0.076 rows=1 loops=1)
> > Filter: (thousand = 0)
> > Rows Removed by Filter: 421
> > -> Index Scan using tenk1_thous_tenthous on tenk1 t1 (cost=0.29..91.30 rows=1 width=8) (actual
> > time=0.026..0.026 rows=1 loops=1)
> > Index Cond: (tenthous = t2.tenthous)
> > -> Result (cost=0.00..0.01 rows=1 width=4) (never executed)
> > Planning Time: 0.334 ms
> > Execution Time: 0.130 ms
> > (11 rows)
> >
>
> This is a nice result. After some more thoughts, I'm feeling the startup
> cost calculation on seq scan looks a more important one to address.
>
> Bad Plan: Append (cost=0.00..2415.09 ..) shows us the "startup cost" is 0.
> Good plan: Append (cost=0.29..1383.12 ..) show us the "startup cost" is
> 0.29.
>
> The major reason of this is we calculate the "startup cost" for
> "SeqScan" and "Index scan" with different guidances. For the "Index
> scan", the startup cost is "the cost to retrieve the first tuple",
> however for "SeqScan", it is not, as we can see the startup cost for
> query "SELECT * FROM tenk2 WHERE thousand = 0" has a 0 startup_cost.
>
> In my understading, "startup cost" means the cost to retrieve the first
> tuple *already*, but at [1], Tom said:
>
> """
> I think that things might work out better if we redefined the startup
> cost as "estimated cost to retrieve the first tuple", rather than its
> current very-squishy definition as "cost to initialize the scan".
> """
>
> The above statement makes me confused. If we take the startup cost as
> cost to retrieve cost for the first tuple, we can do the below quick hack,
>
> @@ -355,8 +355,8 @@ cost_seqscan(Path *path, PlannerInfo *root,
> }
>
> path->disabled_nodes = enable_seqscan ? 0 : 1;
> - path->startup_cost = startup_cost;
> path->total_cost = startup_cost + cpu_run_cost + disk_run_cost;
> + path->startup_cost = startup_cost + (cpu_run_cost + disk_run_cost) * (1 - path->rows / baserel->tuples);
> }

You can check I've already proposed similar change years before.
https://www.postgresql.org/message-id/CAPpHfdvfDAXPXhFQT3Ww%3DkZ4tpyAsD07_oz8-fh0%3Dp6eckEBKQ%40mail.gmail.com
It appears that according to the current model startup_cost is not the
cost of the first tuple. It should be the cost of preparation work,
while extraction of tuples should be distributed uniformly through
total_cost - startup_cost.

------
Regards,
Alexander Korotkov
Supabase

In response to

Re: Considering fractional paths in Append node at 2024-10-18 00:54:48 from Andy Fan

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Alexander Korotkov	2025-03-02 08:53:24	Re: Considering fractional paths in Append node
Previous Message	Alexander Lakhin	2025-03-02 08:00:00	Re: Improving tracking/processing of buildfarm test failures