[POC] Allow an extension to add data into Query and PlannedStmt nodes

From: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: [POC] Allow an extension to add data into Query and PlannedStmt nodes
Date: 2023-03-29 07:02:30
Message-ID: e321eec2-b91c-1378-250a-e38dcf0ed827@postgrespro.ru
Lists: pgsql-hackers

Hi,

Previously, this mailing list has seen some controversial opinions on
queryid generation and the jumbling technique. Here we don't intend to
solve those problems, but to help extensions at least avoid conflicting
with one another over the queryId value.

Extensions may need to pass some query-related data through all stages
of query planning and execution. As a trivial example,
pg_stat_statements uses the queryid at the end of execution to save some
statistics. One more reason: extensions currently conflict over the
queryid value and the logic for changing it. With this patch, that can
be managed.

This patch introduces the structure 'ExtensionData', which manages a
list of entries through a pair of interface functions,
addExtensionDataToNode() and GetExtensionData(). Keep in mind that this
structure may be hidden from the public interface in the future.
An extension should invent a symbolic key to identify its data. It may
invent as many additional keys as it wants, but the best option is no
more than one entry per extension.
Usage of this machinery is demonstrated with pg_stat_statements as an
example: there we introduced a Bigint node just to store the queryId
value natively.
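
To make the shape of such an interface concrete, here is a minimal
standalone sketch of a keyed entry list attached to a node. It is not
the code from the attached patch: the names DemoNode, ExtEntry,
demo_add_extension_data() and demo_get_extension_data() are
hypothetical, it uses plain malloc() instead of memory contexts, and it
copies raw bytes instead of node trees. It only illustrates the "one
symbolic key per extension" idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one keyed entry; not the patch's definition. */
typedef struct ExtEntry
{
    char            *key;   /* symbolic key chosen by the extension */
    void            *data;  /* private copy of the payload */
    size_t           size;  /* payload size in bytes */
    struct ExtEntry *next;
} ExtEntry;

/* Hypothetical stand-in for a Query or PlannedStmt node. */
typedef struct DemoNode
{
    ExtEntry *ext_data;     /* head of the entry list, NULL if empty */
} DemoNode;

/* Linear search by key; the list is expected to stay very short. */
static ExtEntry *
demo_find_entry(const DemoNode *node, const char *key)
{
    for (ExtEntry *e = node->ext_data; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return e;
    return NULL;
}

/*
 * Store a copy of 'data' under 'key', replacing any earlier entry with
 * the same key. Returns 0 on success, -1 if memory allocation fails.
 */
static int
demo_add_extension_data(DemoNode *node, const char *key,
                        const void *data, size_t size)
{
    assert(node != NULL && key != NULL);

    ExtEntry *e = demo_find_entry(node, key);
    void     *copy = malloc(size > 0 ? size : 1);

    if (copy == NULL)
        return -1;
    if (size > 0)
        memcpy(copy, data, size);

    if (e != NULL)
    {
        /* Same key again: overwrite, so each key has a single entry. */
        free(e->data);
        e->data = copy;
        e->size = size;
        return 0;
    }

    e = malloc(sizeof(ExtEntry));
    if (e == NULL)
    {
        free(copy);
        return -1;
    }
    e->key = malloc(strlen(key) + 1);
    if (e->key == NULL)
    {
        free(copy);
        free(e);
        return -1;
    }
    strcpy(e->key, key);
    e->data = copy;
    e->size = size;
    e->next = node->ext_data;
    node->ext_data = e;
    return 0;
}

/* Return the payload stored under 'key', or NULL if there is none. */
static const void *
demo_get_extension_data(const DemoNode *node, const char *key, size_t *size)
{
    assert(node != NULL && key != NULL);

    ExtEntry *e = demo_find_entry(node, key);

    if (e == NULL)
        return NULL;
    if (size != NULL)
        *size = e->size;
    return e->data;
}
```

In this sketch an extension would store its value once, for example
under the key "pg_stat_statements", and look it up by the same key at
the end of execution. A lookup with a key that nobody registered simply
returns NULL.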

A ruthless pgbench benchmark shows that we get some overhead:
1.6% - in default mode
4% - in prepared mode
~0.1% - in extended mode

An optimization that avoids copying the queryId by storing it directly
in the node pointer field keeps this overhead within 0.5% for all these
modes, but increases complexity. So here we demonstrate the
non-optimized variant.
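
As a rough illustration of that kind of optimization (again, not the
patch's code), a 64-bit value can be packed into a pointer-sized slot
instead of being allocated and copied. The helper names below are
hypothetical. The sketch assumes a platform where pointers are at least
64 bits wide, and it relies on integer-to-pointer conversion, which is
implementation-defined in C, round-tripping as it does on mainstream
platforms.

```c
#include <assert.h>
#include <stdint.h>

/* Refuse to build where a pointer cannot hold a 64-bit value. */
static_assert(sizeof(void *) >= sizeof(int64_t),
              "pointer field is too narrow to hold a 64-bit queryId");

/* Pack the value into the pointer itself; nothing is allocated. */
static inline void *
demo_int64_to_pointer(int64_t value)
{
    return (void *) (uintptr_t) (uint64_t) value;
}

/* Recover the value; the "pointer" must never be dereferenced. */
static inline int64_t
demo_pointer_to_int64(const void *slot)
{
    return (int64_t) (uintptr_t) slot;
}
```

The added complexity comes from the fact that every piece of code
touching such a slot has to know whether it holds a real pointer or a
packed value.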

Some open questions remain:
- QueryRewrite: should we copy extension fields from the parent
parsetree to the rewritten ones?
- Do we need to invent a registration procedure to do away with the
names of entries and use compact integer IDs instead?
- Do we need to optimize this structure to avoid a copy for simple data
types, for example, by inventing something like A_Const?

All in all, in our opinion, this issue tends to grow with the
increasing number of extensions that use planner and executor hooks
for various purposes. So any thoughts would be useful.

--
Regards
Andrey Lepikhov
Postgres Professional

Attachment Content-Type Size
v0-0001-Add-on-more-field-into-Query-and-PlannedStmt.patch text/x-patch 15.4 KB
