Re: Jumble the CALL command in pg_stat_statements

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)fr>
Subject: Re: Jumble the CALL command in pg_stat_statements
Date: 2023-09-13 04:37:55
Message-ID: ZQE8o9x7YUWidtzi@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 13, 2023 at 12:48:48AM +0000, Imseih (AWS), Sami wrote:
> The patch also modifies existing test cases for CALL handling in pg_stat_statements
> and adds additional tests which prove that a CALL to an overloaded procedure
> will generate a different query_id.

+CALL overload(1);
+CALL overload('A');
[...]
+ 1 | 0 | CALL overload($1)
+ 1 | 0 | CALL overload($1)

That's not surprising to me. We've historically relied on the
function OID in the jumbling of a FuncExpr, so I'm OK with that. This
may look a bit surprising though if you have a schema that enforces
the same function name for several data types. Having a DEFAULT does
this:
CREATE OR REPLACE PROCEDURE overload(i text, j bool DEFAULT true) AS
$$ DECLARE
r text;
BEGIN
SELECT i::text INTO r;
END; $$ LANGUAGE plpgsql;

Then with these three, and a jumbling based on the OID gives:
+CALL overload(1);
+CALL overload('A');
+CALL overload('A', false);
[...]
- 1 | 0 | CALL overload($1)
+ 2 | 0 | CALL overload($1)

Still this grouping is much better than having thousands of entries
with different values. I am not sure if we should bother improving
that more than what you suggest that, especially as FuncExpr->args can
itself include Const nodes as far as I recall.

> As far as the SET command mentioned in [1] is concerned, it is a bit more complex
> as it requires us to deal with A_Constants which is not very straightforward. We can surely
> deal with SET currently by applying custom query jumbling logic to VariableSetStmt,
> but this can be dealt with in a separate discussion.

As VariableSetStmt is the top-most node structure for SET/RESET
commands, using a custom implementation may be wise in this case,
particularly for the args made of A_Const. I don't really want to go
down to the internals of A_Const outside its internal implementation,
as these can be used for some queries where there are multiple
keywords separated by whitespaces for one single A_Const, like
isolation level values in transaction commands. This would lead to
appending the dollar-based variables in weird ways for some patterns.
Cough.

/* transformed output-argument expressions */
- List *outargs pg_node_attr(query_jumble_ignore);
+ List *outargs;

This choice is a bit surprising. How does it influence the jumbling?
For example, if I add a query_jumble_ignore to it, the regression
tests of pg_stat_statements still pass. This is going to require more
test coverage to prove that this addition is useful.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2023-09-13 04:44:43 Re: subscription TAP test has unused $result
Previous Message Andres Freund 2023-09-13 04:14:12 Re: Query execution in Perl TAP tests needs work