Quick Links

Re: Make query ID more portable

From:	Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To:	Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, Jim Finnerty <jfinnert(at)amazon(dot)com>
Subject:	Re: Make query ID more portable
Date:	2021-10-12 10:00:54
Message-ID:	a45b1ef2-059b-f4a2-c99e-6b44ac13b066@postgrespro.ru
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 12/10/21 13:35, Julien Rouhaud wrote:
> Hi,
>
> On Tue, Oct 12, 2021 at 4:12 PM Andrey V. Lepikhov
> <a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
>> See the patch in attachment as an POC. Main idea here is to break
>> JumbleState down to a 'clocations' part that can be really interested in
>> a post parse hook and a 'context data', that needed to build query or
>> subquery signature (hash) and, I guess, isn't really needed in any
>> extensions.
>
> There have been quite a lot of threads about that in the past, and
> almost every time people wanted to change how the hash was computed.
> So it seems to me that extensions would actually be quite interested
> in that. This is even more the case now that an extension can be used
> to replace the queryid calculation only and keep the rest of the
> extension relying on it as is.
Yes, I know. I have been using such self-made queryID for four years.
And I will use it further.
But core jumbling code is good, fast and much easier in support. The
purpose of this work is extending of jumbling to use in more flexible
way to avoid meaningless copying of this code to an extension.
>> I think, it adds not much complexity and overhead.
>
> I think the biggest change in your patch is:
>
> case RTE_RELATION:
> - APP_JUMB(rte->relid);
> - JumbleExpr(jstate, (Node *) rte->tablesample);
> + {
> + char *relname = regclassout_ext(rte->relid, true);
> +
> + APP_JUMB_STRING(relname);
> + JumbleExpr(jstate, (Node *) rte->tablesample, ctx);
> APP_JUMB(rte->inh);
> break;
>
> Have you done any benchmark on OLTP workload? Adding catalog access
> there is likely to add significant overhead.
Yes, I should do benchmarking. But I guess, main goal of Query ID is
monitoring, that can be switched off, if necessary.
This part made for a demo. It can be replaced by a hook, for example.
>
> Also, why only using the fully qualified relation name for stable
> hashes? At least operators and functions should also be treated the
> same way. If you do that you will probably have way too much overhead
> to be usable in a busy production environment. Why not using the new
> possibility of 3rd party extension for the queryid calculation that
> exactly suits your need?
>
I fully agree with these arguments. This code is POC. Main part here is
breaking down JumbleState, using a local context for subqueries and
sorting of a range table entries hashes.
I think, we can call one routine (APP_JUMB_OBJECT(), as an example) for
all oids in this code. It would allow an extension to intercept this
call and replace oid with an arbitrary value.

--
regards,
Andrey Lepikhov
Postgres Professional

In response to

Re: Make query ID more portable at 2021-10-12 08:35:39 from Julien Rouhaud

Responses

Re: Make query ID more portable at 2021-10-12 13:40:47 from Tom Lane

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Alexander Kuzmenkov	2021-10-12 10:22:50	preserve timestamps when installing headers
Previous Message	Bharath Rupireddy	2021-10-12 09:27:58	Re: Inconsistency in startup process's MyBackendId and procsignal array registration with ProcSignalInit()