Re: Normalize queries starting with SET for pg_stat_statements

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Greg Sabino Mullane <htamfids(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Normalize queries starting with SET for pg_stat_statements
Date: 2024-09-24 07:57:28
Message-ID: ZvJw6OZl22DPlwfP@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 19, 2024 at 03:28:52PM +0900, Michael Paquier wrote:
> FWIW, I'm OK with hiding the value when it comes to a SET clause in a
> CREATE FUNCTION. We already hide the contents of SQL queries inside
> the SQL functions when these are queries that can be normalized, so
> there is a kind of thin argument for consistency, or something close
> to that.

This thread was one of the things I wanted to look at for this commit
fest because that's an issue I have on my stack for some time. And
here you go with a revised patch set.

First, the TAP test proposed upthread is not required, so let's remove
it and rely on pg_stat_statements. It is true that there are a lot of
coverage holes in pg_stat_statements with the various flavors of SET
queries. The TAP test was able to cover a good part of them, still
missed a few spots.

A second thing is that like you I have settled down to a custom
implementation for VariableSetStmt because we have too many grammar
patterns, some of them with values nested in clauses (SET TIME ZONE is
one example), and we should report the grammar keywords without
hardcoding a location. Like for the TIME ZONE part, this comes to a
limitation because it is not possible to normalize the case of SET
TIME ZONE 'value' without some refactoring of gram.y. Perhaps we
could do that in the long-term, but I come down to the fact that I'm
also OK with letting things as they are in the patch, because the
primary case I want to tackle at the SET name = value patterns, and
SET TIME ZONE is just a different flavor of that that can be
translated as well to the most common "name = value" pattern. A
second case is SET SESSION CHARACTERISTICS AS TRANSACTION with its
list of options. Contrary to the patch proposed, I don't think that
it is a good idea to hide that arguments may be included in the
jumbling in the custom function, so I have added one field in
VariableSetStmt to do that, and documented why we need the field in
parsenodes.h. That strikes me as the best balance, and that's going
to be hard to miss each time somebody adds a new grammar for
VariableSetStmt. That's very unlikely at this stage of the project,
but who knows, people like fancy new features.

Attached are two patches:
- 0001 adds a bunch of tests in pg_stat_statements, to cover the SET
patterns. (typo in commit message of this patch, will fix later)
- 0002 is the addition of the normalization. It is possible to see
how the normalization changes things in pg_stat_statements.

0001 is straight-forward and that was I think a mistake to not include
that from the start when I've expanded these tests in the v16 cycle
(well, time..). 0002 also is quite conservative at the end, and this
design can be used to tune easily the jumbling patterns from gram.y
depending on the feedback we'd get.
--
Michael

Attachment Content-Type Size
v2-0001-pg_stta_statements-Expand-tests-with-SET-grammar.patch text/x-diff 10.2 KB
v2-0002-Normalize-queries-starting-with-SET.patch text/x-diff 12.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Denis Laxalde 2024-09-24 08:19:07 Re: Proposal: allow database-specific role memberships
Previous Message Vladlen Popolitov 2024-09-24 07:19:13 Re: Increase of maintenance_work_mem limit in 64-bit Windows