Quick Links

Re: BUG #17871: JIT during postgresql_fdw remote_estimates EXPLAIN have very negatively effect on planning time

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	David Rowley <dgrowleyml(at)gmail(dot)com>, maxim(dot)boguk(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject:	Re: BUG #17871: JIT during postgresql_fdw remote_estimates EXPLAIN have very negatively effect on planning time
Date:	2023-03-30 19:50:05
Message-ID:	280338.1680205805@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-bugs

Andres Freund <andres(at)anarazel(dot)de> writes:
> I wonder if we should address the postgres_fdw case of potentially doing lots
> of EXPLAINs by adding an option geared towards that?

> Or perhaps even a dedicated function? Generating an entire executor tree,
> just to get to estimates that the planner already determined, seems a bit
> questionable... Explain output can be quite large. And we are just looking at
> the top-level cost + rows anyway?

I experimented with the idea of adding an EXPLAIN option designed for
postgres_fdw's needs. It seems quite simple to skip executor startup
(and hence jitting) as well as most of the cost of constructing the
EXPLAIN output. You could imagine going further and having a way for
the planner to skip create_plan() and just return the Path tree, but
I'm skeptical that that would be worth the mess. It would be a lot
harder to shoehorn into the existing EXPLAIN APIs, for one thing,
and I'm not sure what the planner side would look like either.

The attached just implements the actual printout without the
infrastructure of a new EXPLAIN option, but that's enough for rough
timing tests. Using a case similar to Maxim's, with a 100-partition
table with a dropped column, this gives me something like

=# explain select * from tpart;
QUERY PLAN
------------------------------------------------
Plan (cost=0.00..4390.00 rows=226000 width=8)
(1 row)

Time: 0.795 ms

instead of:

=# explain select * from tpart;
QUERY PLAN
-------------------------------------------------------------------------------
Append (cost=0.00..4390.00 rows=226000 width=8)
-> Seq Scan on tpart1 tpart_1 (cost=0.00..32.60 rows=2260 width=8)
-> Seq Scan on tpart2 tpart_2 (cost=0.00..32.60 rows=2260 width=8)
...
-> Seq Scan on tpart100 tpart_100 (cost=0.00..32.60 rows=2260 width=8)
JIT:
Functions: 200
Options: Inlining true, Optimization true, Expressions true, Deforming true
(104 rows)

Time: 7.219 ms

Obviously plenty of bikeshedding to be done here about the option name
and the exact output, but that's the basic idea. It would take only
a couple of lines in postgres_fdw.c to invoke this option if the
remote server is new enough, and its code for parsing the EXPLAIN
output wouldn't need to change at all.

BTW, I'm still wondering how Maxim got to 100-plus-msec overheads;
it's around 10ms at most for me, with what seems like a similar case.

regards, tom lane

Attachment	Content-Type	Size
explain-short-cut-very-wip.patch	text/x-diff	1.4 KB

In response to

Re: BUG #17871: JIT during postgresql_fdw remote_estimates EXPLAIN have very negatively effect on planning time at 2023-03-28 18:38:43 from Andres Freund

Responses

Re: BUG #17871: JIT during postgresql_fdw remote_estimates EXPLAIN have very negatively effect on planning time at 2023-03-31 12:40:59 from Maxim Boguk

Browse pgsql-bugs by date

	From	Date	Subject
Next Message	David Rowley	2023-03-30 23:19:08	Re: BUG #17879: ERROR: unrecognized node type: 2139062143
Previous Message	Tom Lane	2023-03-30 17:33:40	Re: BUG #17877: Referencing a system column in a foreign key leads to incorrect memory access