Re: explain plans for foreign servers

From: Anton Shmigirilov <a(dot)shmigirilov(at)postgrespro(dot)ru>
To: dinesh salve <cooltodinesh(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: explain plans for foreign servers
Date: 2024-11-25 16:53:36
Message-ID: 711A75D0-4805-4F5A-9401-5301FAC0EDB4@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


> Hi Hackers,
>
> I am working on a feature in postgres_fdw extension to show plans used by remote postgresql servers in the output of the EXPLAIN command.
> I think this will help end users understand query execution plans used by remote servers. Sample output for table people where people_1 is local partition and people_2 is remote partition would look like -
>
> postgres:5432> explain select * from "test"."people";
> QUERY PLAN
> Append (cost=0.00..399.75 rows=2270 width=46)
> → Seq Scan on "people.people_1" people_1 (cost=0.00..21.00 rows=1100 width=46)
> → Foreign Scan on "people.people_2" people_2 (cost=100.00..367.40 rows=1170 width=46)
> Remote Plan
> Seq Scan on "people.people_2" (cost=0.00..21.00 rows=1100 width=46)
> (5 rows)
>
> I would like community inputs on below high level thoughts:
>
> 1. To enable this feature, either we can introduce a new option in EXPLAIN command e.g. (fetch_remote_plans true) or control this behaviour using a guc defined in postgres_fdw extension. I am more inclined towards guc as this feature is for extension postgres_fdw. Adding the EXPLAIN command option might force other FDW extensions to handle this.
>
> 2. For ANALYZE = false, the idea is that postgres_fdw would create a connection to a remote server, prepare SQL to send over connection and store received plans in ExplainState.
>
> 3. For ANALYZE = true, idea is that postgres_fdw would set a new guc over connection to remote server, remote server postgres_fdw would read this guc and send back used query plan as a NOTICE (similar to auto_explain extension does) with custom header which postgres_fdw extension understands. . We also have an opportunity to introduce a new message type in the protocol to send back explain plans but it might look like too much work for this feature. Open to ideas here.
>
> Dinesh Salve
> SDE(at)AWS

Hi Dinesh,

Thank you for your proposal regarding explain for foreign servers.

I have been working on a similar feature and there are several considerations to take into account.

To enable this feature it is preferable to use GUC rather than the EXPLAIN option, as it simplifies regression testing. You can simply set it to off before most tests that involve plan checking, while leaving the rest unchanged. This leads to a reduction in the size of the differences.

If it is necessary to provide only the execution plan of the foreign query (without actual timing metrics), you should send EXPLAIN with ANALYZE set to OFF, regardless of the initial ANALYZE state. This approach will prevent the re-execution of the remote query (the SQL part of the ForeignScan node), which could potentially lead to side effects. It's safe to send ANALYZE ON during remote EXPLAIN only if your remote SQL is idempotent, i.e. doesn't change anything. That way you can't sent it for *Modify nodes, but it can be applicable for certain ForeignScans, such as those involving FunctionScan. In general, it is safer to enforce ANALYZE OFF in all cases.

Also you can't expose to main EXPLAIN some metrics obtained from remote side through the "remote" explain. For example, values such as actual time, planning time, execution time, and similar metrics cannot be exposed because they relates to events that occurred during the "EXPLAIN" communication, rather than during the actual planning and execution phases. Therefore, these times would likely mislead the user. I suppose it's better to enforce EXPLAIN with TIMING OFF and SUMMARY OFF when obtaining the remote portion of EXPLAIN.

While reconstructing (deparsing) the SQL query to send as part of EXPLAIN to the remote server, you can obtain SQL statements with placeholders (i.e. $1, $2, etc) instead of actual parameter values. It's syntactically incorrect SQL, which will lead to an error on the remote side. There are two ways to avoid this. You can use GENERIC_PLAN feature (v16+), which accepts dollar-parameters here. Another option is to use params_list == NULL in the deparseSelectStmtForRel() function to substitute dummy null values for placeholders, thereby generating syntactically correct SQL. The downside of this approach is the need to perform an additional deparse stage, which can be redundant.

However looking forward a patch, it is likely that some (or all) of my thoughts may become irrelevant.

--
Best regards,
Anton Shmigirilov,
Postgres Professional

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Emanuele Musella 2024-11-25 17:35:49 Re: Parametrization minimum password lenght
Previous Message Kirill Reshke 2024-11-25 16:52:17 Re: Truncate logs by max_log_size