Re: Proposal - Allow extensions to set a Plan Identifier

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>, Andrei Lepikhov <lepihov(at)gmail(dot)com>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Lukas Fittl <lukas(at)fittl(dot)com>
Subject: Re: Proposal - Allow extensions to set a Plan Identifier
Date: 2025-02-17 20:39:53
Message-ID: CAA5RZ0unwLhQzgfWk5Ahs5bVT4O=wsspjs9SqpPMHnThrAn6mw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Feb 16, 2025 at 5:34 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Sat, Feb 15, 2025 at 10:29:41AM +0100, Andrei Lepikhov wrote:
> > I have already implemented it twice in different ways as a core patch.
> > In my projects, we need to track queryId and plan node ID for two reasons:
>
> Are these available in the public somewhere or is that simply what
> Sami is proposing?

Andrei, you mention "plan node ID" which if I understand correctly will be
a good thing to expose to determine which part of the plan most of the time
is spent. This is similar to an idea I raised in a different thread for
explain plan progress [1].

However, I am proposing something different, which is we track a plan_id of the
full execution plan. Some extensions may hash the text version of the plan,
others may choose to do something more elaborate such as "jumbling" a plan
tree. The point becomes is that monitoring extensions such as
pg_stat_monitor and
others can set the plan_id in core so it's available in backend status.

> > 1. Optimisational decisions made during transformation/path generation
> > stages up to the end of execution to correct them in the future.
> > 2. Cache information about the query tree/node state to use it for
> > statistical purposes.
>
> Gathering of statistical data based on a node tree is one reason,
> where it may or may not be required to walk through a path.
> Influencing the plan used with an already-generated one (where hints
> could be used) was the second one, mostly replacing a plan in the
> planner hook. Influencing the paths in a plan or a subplan didn't
> really matter much with hints to drive the paths.
>
> > In my experience, we don't need a single plan_id field; we just need an
> > 'extended list' pointer at the end of the Plan, PlannedStmt, Query, and
> > RelOptInfo structures and a hook at the end of the create_plan_recurse() to
> > allow passing some info from the path generator to the plan tree.
> > An extension may add its data to the list (we may register an extensible
> > node type to be sure we don't interfere with other extensions) and
> > manipulate it in a custom way and with custom UI.
> > Generally, it makes the optimiser internals more open to extensions.
>
> Sounds to me that this maps with the addition of a "private" area to
> some of the plan structures to allow extensions to attach some data
> that would be reused elsewhere, which is rather independent than
> what's suggested here?

+1, such a private area is different from what is being proposed.

[1] https://www.postgresql.org/message-id/CAA5RZ0uGDKWxqUCMrsWKV425T2f6mqJsXKg6chq%2BWuyCwNPUGw%40mail.gmail.com.

--
Sami

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2025-02-17 20:57:30 Re: BUG #18815: Logical replication worker Segmentation fault
Previous Message Jelte Fennema-Nio 2025-02-17 20:25:33 Re: Bump soft open file limit (RLIMIT_NOFILE) to hard limit on startup