query ID goes missing with extended query protocol

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>
Subject: query ID goes missing with extended query protocol
Date: 2024-08-28 20:27:38
Message-ID: CA+TgmoZxtnf_jZ=VqBSyaU8hfUkkwoJCJ6ufy4LGpXaunKrjrg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Suppose you set compute_query_id=on and then execute a query using
either the simple or the extended query protocol. With the simple
query protocol, the query ID is advertised during query execution;
with the extended query protocol, it's not. Here's a test case to
demonstrate:

robert.haas=# select query_id from pg_stat_activity where pid =
pg_backend_pid() \g
query_id
---------------------
4332952175652079452
(1 row)

robert.haas=# select query_id from pg_stat_activity where pid =
pg_backend_pid() \bind \g
query_id
----------

(1 row)

I found this surprising and decided to investigate why it was
happening. It turns out that exec_{parse,bind,execute}_message() all
start by calling pgstat_report_activity(STATE_RUNNING, ...) which
resets the advertised query ID, which seems fair enough. After that,
exec_parse_message() does parse analysis, which results in a call to
pgstat_report_query_id() that advertises the correct query ID. After
resetting the advertised query ID initially, exec_bind_message() calls
PortalStart() which calls ExecutorStart() which re-advertises the same
query ID that was computed at parse analysis time. At execute time, we
call PortalRun() which calls ExecutorRun() which does NOT re-advertise
the saved query ID. But as far as I can see, that's just an oversight
in commit 4f0b0966c86, which added this hunk in ExecutorStart:

+ /*
+ * In some cases (e.g. an EXECUTE statement) a query execution will skip
+ * parse analysis, which means that the queryid won't be reported. Note
+ * that it's harmless to report the queryid multiple time, as the call will
+ * be ignored if the top level queryid has already been reported.
+ */
+ pgstat_report_queryid(queryDesc->plannedstmt->queryId, false);

But I think that the entire first sentence of this comment is just
wrong. Query execution can't skip parse analysis; parse analysis has
to be performed before planning and planning has to be performed
before execution. So any time parse analysis computes the query ID, we
should have it at execution time, as long as it got saved into the
PlannedStmt (and why would we ever intentionally skip that?). The
comment is also wrong about the consequences: this not only affects
the EXECUTE statement but also an Execute protocol message, hence the
behavior demonstrated above.

I propose that we should:

- Add a call to
pgstat_report_query_id(queryDesc->plannedstmt->queryId, false) to the
top of ExecutorRun() with an appropriate comment.
- Fix the incorrect comment mentioned above.
- Back-patch to all releases containing 4f0b0966c86 i.e. v14+.

Thoughts?

--
Robert Haas
EDB: http://www.enterprisedb.com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2024-08-28 20:29:43 Re: allowing extensions to control planner behavior
Previous Message Robert Haas 2024-08-28 20:12:47 Re: allowing extensions to control planner behavior