Re: BUG #17871: JIT during postgresql_fdw remote_estimates EXPLAIN have very negatively effect on planning time

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, maxim(dot)boguk(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17871: JIT during postgresql_fdw remote_estimates EXPLAIN have very negatively effect on planning time
Date: 2023-03-30 19:50:05
Message-ID: 280338.1680205805@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Andres Freund <andres(at)anarazel(dot)de> writes:
> I wonder if we should address the postgres_fdw case of potentially doing lots
> of EXPLAINs by adding an option geared towards that?

> Or perhaps even a dedicated function? Generating an entire executor tree,
> just to get to estimates that the planner already determined, seems a bit
> questionable... Explain output can be quite large. And we are just looking at
> the top-level cost + rows anyway?

I experimented with the idea of adding an EXPLAIN option designed for
postgres_fdw's needs. It seems quite simple to skip executor startup
(and hence jitting) as well as most of the cost of constructing the
EXPLAIN output. You could imagine going further and having a way for
the planner to skip create_plan() and just return the Path tree, but
I'm skeptical that that would be worth the mess. It would be a lot
harder to shoehorn into the existing EXPLAIN APIs, for one thing,
and I'm not sure what the planner side would look like either.

The attached just implements the actual printout without the
infrastructure of a new EXPLAIN option, but that's enough for rough
timing tests. Using a case similar to Maxim's, with a 100-partition
table with a dropped column, this gives me something like

=# explain select * from tpart;
QUERY PLAN
------------------------------------------------
Plan (cost=0.00..4390.00 rows=226000 width=8)
(1 row)

Time: 0.795 ms

instead of:

=# explain select * from tpart;
QUERY PLAN
-------------------------------------------------------------------------------
Append (cost=0.00..4390.00 rows=226000 width=8)
-> Seq Scan on tpart1 tpart_1 (cost=0.00..32.60 rows=2260 width=8)
-> Seq Scan on tpart2 tpart_2 (cost=0.00..32.60 rows=2260 width=8)
...
-> Seq Scan on tpart100 tpart_100 (cost=0.00..32.60 rows=2260 width=8)
JIT:
Functions: 200
Options: Inlining true, Optimization true, Expressions true, Deforming true
(104 rows)

Time: 7.219 ms

Obviously plenty of bikeshedding to be done here about the option name
and the exact output, but that's the basic idea. It would take only
a couple of lines in postgres_fdw.c to invoke this option if the
remote server is new enough, and its code for parsing the EXPLAIN
output wouldn't need to change at all.

BTW, I'm still wondering how Maxim got to 100-plus-msec overheads;
it's around 10ms at most for me, with what seems like a similar case.

regards, tom lane

Attachment Content-Type Size
explain-short-cut-very-wip.patch text/x-diff 1.4 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message David Rowley 2023-03-30 23:19:08 Re: BUG #17879: ERROR: unrecognized node type: 2139062143
Previous Message Tom Lane 2023-03-30 17:33:40 Re: BUG #17877: Referencing a system column in a foreign key leads to incorrect memory access