Re: [PATCH] Optionally record Plan IDs to track plan changes for a query

From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Sami Imseih <samimseih(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Lukas Fittl <lukas(at)fittl(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Marko M <marko(at)pganalyze(dot)com>
Subject: Re: [PATCH] Optionally record Plan IDs to track plan changes for a query
Date: 2025-03-18 19:27:47
Message-ID: 667bcfbf-9603-4dbf-bde8-823df894a7bc@tantorlabs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 12.02.2025 19:00, Alvaro Herrera wrote:
> On 2025-Feb-12, Julien Rouhaud wrote:
>
>> On Wed, Feb 12, 2025 at 01:57:47PM +0100, Alvaro Herrera wrote:
>>> Anyway, I think that's different. We do support compute_query_id=off as
>>> a way for a custom module to compute completely different query IDs
>>> using their own algorithm, which I think is what you're referring to.
>>> However, the ability to affect the way the in-core algorithm works is a
>>> different thing: you still want in-core code to compute the query ID.
>> I don't think that's the actual behavior, or at least not what it was
>> supposed to be.
>>
>> What we should have is the ability to compute queryid, which can be
>> either in core or done by an external module, but one only one can /
>> should be done.
> Yes, that's what I tried to say, but I don't understand why you say I
> said something different.
>
>>> Right now, the proposal in the other thread is that if you want to
>>> affect that algorithm in order to merge arrays to be considered a single
>>> query element regardless of its length, you set the GUC for that.
>>> Initially the GUC was in the core code. Then, based on review, the GUC
>>> was moved to the extension, _BUT_ the implementation was still in the
>>> core code: in order to activate it, the extension calls a function that
>>> modifies core code behavior. So there are more moving parts than
>>> before, and if you for whatever reason want that behavior but not the
>>> extension, then you need to write a C function. To me this is absurd.
>>> So what I suggest we do is return to having the GUC in the core code.
>> I agree, although that probably breaks the queryid extensibility.
> It does?
>
>> I haven't read the patch but IIUC if you want the feature to work you
>> need to both change the queryid calculation but also the way the
>> constants are recorded and the query text is normalized, and I don't
>> know if extensions have access to it.
> Hmm. As for the query text: with Andrey's feature with the GUC in core,
> a query like this
> SELECT foo FROM tab WHERE col1 IN (1,2,3,4)
> will have in pg_stat_activity an identical query_id to a query like this
> SELECT foo WHERE tab WHERE col1 IN (1,2,3,4,5)
> even though the query texts differ (in the number of elements in the
> array). I don't think this is a problem. This means that the query_id
> for two different queries can be identical, but that should be no
> surprise, precisely because the GUC that controls it is documented to do
> that.
>
> If pg_stat_statements is enabled with Andrey's patch, then the same
> query_id will have a single entry (which has stats for both execution of
> those queries) with that query_id, with a normalized query text that is
> going to be different from those two above; without Andrey's feature,
> the text would be
> SELECT foo WHERE tab WHERE col1 IN ($1,$2,$3,$4);
> SELECT foo WHERE tab WHERE col1 IN ($1,$2,$3,$4,$5);
> (that is, pg_stat_statements transformed the values into placeholders,
> but using exactly the same number of items in the array as the original
> queries). With Andrey's feature, it will be
> SELECT foo WHERE tab WHERE col1 IN (...);
> that is, the query text has been modified and no longer matches exactly
> any of the queries in pg_stat_activity. But note that the query text
> already does not match what's in pg_stat_activity, even before Andrey's
> patch.
>
> I don't understand what you mean with "the way the constants are
> recorded". What constants are you talking about? pg_stat_statements
> purposefully discards any constants used in the query (obviously).
>
>> If they have access and fail to do what the GUC asked then of course
>> that's just a bug in that extension.
> I don't understand what bug are you thinking that such hypothetical
> extension would have. (pg_stat_statements does of course have access to
> the query text and to the location of all constants).
>
>>> Now I admit I'm not sure what the solution would be for the problem
>>> discussed in this subthread. Apparently the problem is related to temp
>>> tables and their changing OIDs. I'm not sure what exactly the proposal
>>> for a GUC is.
>> I'm not proposing anything, just explaining why pg_stat_statements is
>> generally useless if you use temp tables as someone asked.
> Ah, okay. Well, where you see a deficiency, I see an opportunity for
> improvement :-)
>

Hi everyone,

I support the idea of computing the planid  for temporary tables using
'pg_temp.rel_name'. Moreover, we have already started using this
approach for computing queryid [0]. It seems reasonable to apply the
same logic to the planid calculation as well.

[0]:
https://www.postgresql.org/message-id/flat/Z9mkqplmUpQ4xG52%40msg.df7cb.de#efb20f01bec32aeafd58e5d4ab0dfc16

--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2025-03-18 19:31:16 wrong error message related to unsupported feature
Previous Message Noah Misch 2025-03-18 19:17:41 Re: race condition in pg_class