Re: query_id, pg_stat_activity, extended query protocol

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>
Cc: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>, Andrei Lepikhov <lepihov(at)gmail(dot)com>, kaido vaikla <kaido(dot)vaikla(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: query_id, pg_stat_activity, extended query protocol
Date: 2024-07-18 08:56:09
Message-ID: ZpjYqSpvYL6AIlfE@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jul 17, 2024 at 11:32:49AM +0200, Anthonin Bonnefoy wrote:
> Wouldn't it be enough to call pgstat_report_query_id in ExecutorRun
> and ProcessUtility? With those changes [1], both normal statements and
> utility statements called through extended protocol will correctly
> report the query_id.

Interesting, and this position is really tempting. By doing so you
would force the query ID to be set to the one from the CTAS and
EXPLAIN, because these would be executed before the inner queries, and
pgstat_report_query_id() with its non-force option does not overwrite
what would be already set (aka what should be the top-level query ID).

Using ExecutorRun() feels consistent with the closest thing I've
touched in this area lately in 1d477a907e63, because that's the only
code path that we are sure to take depending on the portal execution
(two execution scenarios depending on how rows are retrieved, as far
as I recall). The comment should be perhaps more consistent with the
executor start counterpart. So I would be OK with that.. The
location before the hook of ProcessUtility is tempting, as it would
take care of the case of PortalRunMulti(). However.. Joining with a
point from Sami upthread..

This is still not enough in the case of where we have a holdStore, no?
This is the case where we would do *one* ExecutorRun(), followed up by
a scan of the tuplestore in more than one execute message. The v2
proposed upthread, by positioning a query ID to be set in
PortalRunSelect(), is still positioning that in two places.

Hmm... How about being much more aggressive and just do the whole
business in exec_execute_message(), just before we do the PortalRun()?
I mean, that's the source of all our problems, and we know the
statements that the portal will work on so we could go through the
list, grab the first planned query and set the query ID based on that,
without caring about the portal patterns we would need to think about.

> [1] https://github.com/bonnefoa/postgres/commit/bf4b332d7b481549c6d9cfa70db51e39a305b9b2

Or use the following to download the patch, that I am attaching here:
https://github.com/bonnefoa/postgres/commit/bf4b332d7b481549c6d9cfa70db51e39a305b9b2.patch

Please attach things to your emails, if your repository disappears for
a reason or another we would lose knowledge in the archives of the
community lists.
--
Michael

Attachment Content-Type Size
queryid.patch text/x-diff 2.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2024-07-18 09:23:58 Re: Add mention of execution time memory for enable_partitionwise_* GUCs
Previous Message Peter Smith 2024-07-18 08:24:39 Re: Pgoutput not capturing the generated columns