Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, Atsushi Torikoshi <atorik(at)gmail(dot)com>, Tatsuro Yamada <tatsuro(dot)yamada(dot)tf(at)nttcom(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Evgeny Efimkin <efimkin(at)yandex-team(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
Date: 2020-10-06 13:22:29
Message-ID: 20201006132229.GB3686@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Oct 6, 2020 at 02:34:58PM +0900, Michael Paquier wrote:
> On Mon, Oct 05, 2020 at 11:23:50PM -0400, Bruce Momjian wrote:
> > On Tue, Oct 6, 2020 at 11:11:27AM +0800, Julien Rouhaud wrote:
> >> Maybe we could add a new hook for only queryid computation, and add a
> >> GUC to let people choose between no queryid computed, core computation
> >> (current pg_stat_statement) and 3rd party plugin?
> >
> > That all seems very complicated. If we go in that direction, I suggest
> > we just give up getting any of this into core.
>
> A GUC would have at least the advantage to make the computation
> consistent for any system willing to consume it, with the option to
> not pay any potential performance impact, though I have to admit that
> just moving the query ID computation of PGSS into core may not be the
> best option as a query ID of 0 means the same thing for a utility, for
> an initialization, and for a backend running a query with an unknown
> value, but that could be worked out.
>
> FWIW, I think that adding the system ID in the hash is too
> restrictive, as it could be interesting for users to do stat
> comparisons across multiple systems running the same major version.
> It would be better to not give any strong guarantee that the query ID
> computed will remain consistent across major versions so as it is
> possible to keep improving it. Also, if nothing has been done that
> changes the hashing computation, I see little benefit in forcing a
> breakage by adding something like PG_MAJORVERSION_NUM or such in the
> hash computation.

I thought some more about this. First, I think having the queryid hash
code in the server, without requiring pg_stat_statements, is a
requirement --- I think too many people will want to use this feature
independent of pg_stat_statements. Second, I understand the desire to
have different hash computation methods, depending on what level of
detail/matching you want.

I propose moving the pg_stat_statements queryid hash code into the
server (with a version number), and also adding a postgressql.conf
variable that lets you control how detailed the queryid hash is
computed. This addresses the problem of people wanting different hash
methods.

When computing a hash, the queryid detail level and version number will
be mixed into the hash, so only a hash that used a similar query and
identical queryid detail level would match.

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EnterpriseDB https://enterprisedb.com

The usefulness of a cup is in its emptiness, Bruce Lee

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jaime Casanova 2020-10-06 13:28:04 Re: enable_incremental_sort changes query behavior
Previous Message Heikki Linnakangas 2020-10-06 13:19:31 Re: Yet another fast GiST build