Re: [PATCH] Optionally record Plan IDs to track plan changes for a query

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Sami Imseih <samimseih(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Lukas Fittl <lukas(at)fittl(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Marko M <marko(at)pganalyze(dot)com>
Subject: Re: [PATCH] Optionally record Plan IDs to track plan changes for a query
Date: 2025-02-12 13:46:06
Message-ID: Z6ymHoX-Y1VlBDj3@jrouhaud
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 12, 2025 at 01:57:47PM +0100, Alvaro Herrera wrote:
> On 2025-Feb-12, Julien Rouhaud wrote:
>
> > > FWIW, I think options to tweak queryId computation is something that
> > > should be in core. It was discussed earlier in the context of IN
> > > list merging; the patch for this currently has the guc for the
> > > feature in pg_stat_statements, but there was a discussion about
> > > actually moving this to core [1] Giving the user a way to control
> > > certain behavior about the queryId computation is a good thing to do
> > > in core; especially queryId is no longer just consumed in
> > > pg_stat_statements. Maybe the right answer is an enum GUC, not sure
> > > yet.
>
> > Well, the ability for extensions to override the actual queryid
> > calculation was the result of more than half a decade of strong
> > disagreements about it. And I'm definitely not volunteering to
> > reopen that topic :)
>
> Anyway, I think that's different. We do support compute_query_id=off as
> a way for a custom module to compute completely different query IDs
> using their own algorithm, which I think is what you're referring to.
> However, the ability to affect the way the in-core algorithm works is a
> different thing: you still want in-core code to compute the query ID.

I don't think that's the actual behavior, or at least not what it was supposed
to be.

What we should have is the ability to compute queryid, which can be either in
core or done by an external module, but one only one can / should be done. And
then you have stuff that use that queryid, e.g. pg_stat_statements,
pg_stat_activity and whatnot, no matter what generated it. That's per the
original commit 5fd9dfa5f50e message:

Add compute_query_id GUC to control whether a query identifier should be
computed by the core (off by default). It's thefore now possible to
disable core queryid computation and use pg_stat_statements with a
different algorithm to compute the query identifier by using a
third-party module.

To ensure that a single source of query identifier can be used and is
well defined, modules that calculate a query identifier should throw an
error if compute_query_id specified to compute a query id and if a query
idenfitier was already calculated.

> Right now, the proposal in the other thread is that if you want to
> affect that algorithm in order to merge arrays to be considered a single
> query element regardless of its length, you set the GUC for that.
> Initially the GUC was in the core code. Then, based on review, the GUC
> was moved to the extension, _BUT_ the implementation was still in the
> core code: in order to activate it, the extension calls a function that
> modifies core code behavior. So there are more moving parts than
> before, and if you for whatever reason want that behavior but not the
> extension, then you need to write a C function. To me this is absurd.
> So what I suggest we do is return to having the GUC in the core code.

I agree, although that probably breaks the queryid extensibility. I haven't
read the patch but IIUC if you want the feature to work you need to both change
the queryid calculation but also the way the constants are recorded and the
query text is normalized, and I don't know if extensions have access to it. If
they have access and fail to do what the GUC asked then of course that's just a
bug in that extension.

> Now I admit I'm not sure what the solution would be for the problem
> discussed in this subthread. Apparently the problem is related to temp
> tables and their changing OIDs. I'm not sure what exactly the proposal
> for a GUC is.

I'm not proposing anything, just explaining why pg_stat_statements is generally
useless if you use temp tables as someone asked.

> I do note that what you do in pg_queryid, which is
> simply to ignore the table altogether, is probably not a great idea.

Yeah, that's also why I said in my previous message that using its name for the
queryid would be better. Note that in pg_queryid it's already possible to use
relation names rather than oid for the queryid (which I wouldn't recommend, but
it's good for testing). I just never implemented it for a temp-only
granularity.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dagfinn Ilmari Mannsåker 2025-02-12 13:58:16 Re: TAP test command_fails versus command_fails_like
Previous Message Fujii Masao 2025-02-12 13:37:04 Re: Question about UpdateFullPageWrites() at the end of recovery.