Re: Odd behavior with pg_stat_statements and queries called from SQL functions

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Odd behavior with pg_stat_statements and queries called from SQL functions
Date: 2022-11-17 09:30:23
Message-ID: CAExHW5s2-reivuntZ=AWwUFpxisgEGVHqrZRPuv+3ugB249Q9g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 17, 2022 at 2:44 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> Hi,
>
> On Wed, Nov 16, 2022 at 11:26:09PM -0800, Maciek Sakrejda wrote:
> > I noticed an odd behavior today in pg_stat_statements query
> > normalization for queries called from SQL-language functions. If I
> > have three functions that call an essentially identical query (the
> > functions are only marked SECURITY DEFINER to prevent inlining):
> >
> > maciek=# create or replace function f1(f1param text) returns text
> > language sql as 'select f1param' security definer;
> > CREATE FUNCTION
> > maciek=# create or replace function f2(f2param text) returns text
> > language sql as 'select f2param' security definer;
> > CREATE FUNCTION
> > maciek=# create or replace function f3(text) returns text language sql
> > as 'select $1' security definer;
> > CREATE FUNCTION
> > [...]
> > maciek=# select queryid, query, calls from pg_stat_statements where
> > queryid = 6741491046520556186;
> > queryid | query | calls
> > ---------------------+----------------+-------
> > 6741491046520556186 | select f1param | 3
> > (1 row)
> >
> > If I call f3 first, then f2 and f1, I get
> >
> > maciek=# select queryid, query, calls from pg_stat_statements where
> > queryid = 6741491046520556186;
> > queryid | query | calls
> > ---------------------+-----------+-------
> > 6741491046520556186 | select $1 | 3
> > (1 row)
> >
> > I understand that the query text may be captured differently for
> > different queries that map to the same id, but it seems confusing that
> > parameter names referenced in queries called from functions are not
> > normalized away, since they're not germane to the query execution
> > itself, and the context of the function is otherwise stripped away by
> > this point. I would expect that all three of these queries end up in
> > pg_stat_statements with the query text "select $1".Thoughts?
>
> None of those queries actually contain any constant, so the query text is just
> saved as-is in all the versions.
>
> I'm not sure that doing normalization for parameters would give way better
> results. It's true that a parameter name can change between different
> functions running the exact same statements, but is it really likely to happen?

Multiple functions running the same query is quite possible. I am
wondering why it took so long to identify this behaviour.

> And what if the two functions have different number of parameters in different
> orders? $1 could mean different things in different cases, and good luck
> finding out which one it is. At least with the parameter name you have a
> chance to figure out what the parameter was exactly.
>
Reporting one of the parameters as is is a problem yes. Can the
parameters be converted into some normailzed form like replacing
parameters with ? (or some constant string indicating parameter)
everywhere.

--
Best Wishes,
Ashutosh Bapat

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2022-11-17 09:31:46 Re: [PoC] Reducing planning time when tables have many partitions
Previous Message Julien Rouhaud 2022-11-17 09:14:26 Re: Odd behavior with pg_stat_statements and queries called from SQL functions