From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Sami Imseih <samimseih(at)gmail(dot)com> |
Cc: | Christoph Berg <myon(at)debian(dot)org>, Lukas Fittl <lukas(at)fittl(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, ma lz <ma100(at)hotmail(dot)com> |
Subject: | Re: query_id: jumble names of temp tables for better pg_stat_statement UX |
Date: | 2025-03-25 23:10:16 |
Message-ID: | Z-M32Cm7FM-1gpxB@paquier.xyz |
Lists: | pgsql-general pgsql-hackers |
On Tue, Mar 25, 2025 at 11:58:21AM -0500, Sami Imseih wrote:
> "Since the queryid hash value is computed on the post-parse-analysis
> representation of the queries, the opposite is also possible: queries with
> identical texts might appear as separate entries, if they have different
> meanings as a result of factors such as different search_path settings."
>
> I think this text could remain as is, because search_path still
> matters for things like functions, etc.
Yeah, I think that's OK as-is. I'm open to more improvements,
including more tests for these function patterns. It's one of these
areas where we should be able to tweak RangeTblFunction and apply a
custom function to its funcexpr, and please note that I have no idea
how complex it could become as this is a Node expression. :D
Functions in a temporary schema are not as common as temp tables, I
guess, so these matter less, but they would still be a cause of bloat
for monitoring in very specific workloads.
> 2/
> "For example, pg_stat_statements will consider two apparently-identical
> queries to be distinct, if they reference a table that was dropped and
> recreated between the executions of the two queries."
>
> This is no longer true for relations, but is still true for functions. I think
> we should mention the caveats in a bit more detail as this change
> will have impact on the most common case. What about something
> like this?
>
> "For example, pg_stat_statements will consider two apparently-identical
> queries to be distinct, if they reference a function that was dropped and
> recreated between the executions of the two queries.
That applies to more than just functions, but we could remain a bit more
evasive, with "if they reference *for example* a function that was
dropped and recreated between the executions of the two queries".
Note that for DDLs, like CREATE TABLE, we also group entries with
identical relation names, so we are kind of in line with the patch,
not with the current docs.
> Conversely, if a table is dropped and recreated between the
> executions of queries, two apparently-identical queries may be
> considered the same. However, if the alias for a table is different
> for semantically similar queries, these queries will be considered distinct"
This addition sounds like an improvement here.
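To make the behavior described in that proposed wording concrete, here is a toy sketch of the difference between hashing a relation's OID and hashing its name plus alias. This is not the jumbling code from the patch: the function `toy_query_id`, the tuple layout, and the OID values are all made up for illustration, and SHA-256 merely stands in for the real hash.

```python
import hashlib

def toy_query_id(relations, by_name=True):
    """Hash a list of (oid, relname, alias) tuples into a toy query ID.

    by_name=True hashes the relation name and alias (hypothetical model of
    name-based jumbling); by_name=False hashes the relation OID only.
    """
    h = hashlib.sha256()
    for oid, relname, alias in relations:
        key = f"{relname}\0{alias}" if by_name else str(oid)
        h.update(key.encode())
        h.update(b"\x01")  # separator between range table entries
    return h.hexdigest()[:16]

# Same table name, different OIDs: the table was dropped and recreated.
before = [(16384, "tmp_orders", "o")]
after = [(16999, "tmp_orders", "o")]
# Same recreated table, but referenced under a different alias.
aliased = [(16999, "tmp_orders", "ord")]

same_by_name = toy_query_id(before) == toy_query_id(after)
same_by_oid = (toy_query_id(before, by_name=False)
               == toy_query_id(after, by_name=False))
alias_matters = toy_query_id(after) != toy_query_id(aliased)
```

In this model the dropped-and-recreated table keeps the same ID when hashing by name but not when hashing by OID, and changing only the alias produces a different ID, which matches the two cases the suggested doc text calls out.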
As this thread has proved, we had little coverage of these cases in pgss,
so I've applied the tests as an independent change. It is also useful
to track how things change in the commit history depending on how the
computation is tweaked. I've also included your doc suggestions. I
feel that we could do better here, but that's a common statement
anyway when it comes to the documentation.
--
Michael
Attachment | Content-Type | Size |
---|---|---|
v6-0001-Add-custom-query-jumble-function-for-RangeTblEntr.patch | text/x-diff | 5.9 KB |