Quick Links

Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

From:	David Rowley <dgrowleyml(at)gmail(dot)com>
To:	Bykov Ivan <i(dot)bykov(at)modernsys(dot)ru>
Cc:	Michael Paquier <michael(at)paquier(dot)xyz>, Sami Imseih <samimseih(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET
Date:	2025-03-10 11:45:35
Message-ID:	CAApHDvpCg-JeZVJ9kMYL-c-iVwK2YTRP6+tDic56D7HV0ZLtNQ@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Mon, 10 Mar 2025 at 23:17, Bykov Ivan <i(dot)bykov(at)modernsys(dot)ru> wrote:
> I have attached the updated Variant A patch with the following changes:
> - Implemented a pg_stat_statements regression test (see similar test results
> on REL_17_3 in Query-ID-collision-at_pg_stat_statements-1ea6e890b22.txt).
> - Added extra description about the Query ID collision
> in src/include/nodes/parsenodes.h.

I've not paid much attention to the jumble code before, but I did
glance at this patch:

+ *
+ * The query jumbling process may trigger a Query ID collision when
+ * two Query fields located sequentially contain the same type of nodes
+ * (a list of nodes), and both of them may be NULL.
+ * List of such groups of fields:
+ * - limitOffset and limitCount (which may be an int8 expression or NULL);
+ * - distinctClause, and sortClause (which may be a list of
+ * SortGroupClause entries or NULL);
+ * To prevent this collision, we should insert at least one scalar field
+ * without the attribute query_jumble_ignore between such fields.
*/
typedef struct Query
{
@@ -208,11 +218,11 @@ typedef struct Query

List *distinctClause; /* a list of SortGroupClause's */

- List *sortClause; /* a list of SortGroupClause's */
-
Node *limitOffset; /* # of result tuples to skip (int8 expr) */
- Node *limitCount; /* # of result tuples to return (int8 expr) */
LimitOption limitOption; /* limit type */
+ Node *limitCount; /* # of result tuples to return (int8 expr) */
+
+ List *sortClause; /* a list of SortGroupClause's */

List *rowMarks; /* a list of RowMarkClause's */

It seems to me that if this fixes the issue, that the next similar one
is already lurking in the shadows waiting to jump out on us.

Couldn't we adjust all this code so that we pass a seed to
AppendJumble() that's the offsetof(Struct, field) that we're jumbling
and incorporate that seed into the hash? I don't think we could
possibly change the offset in a minor version, so I don't think
there's a risk that could cause jumble value instability. Possibly
different CPU architectures would come up with different offsets
through having different struct alignment requirements, but we don't
make any guarantees about the jumble values matching across different
CPU architectures anyway.

I admit to only having spent a few minutes looking at this, so there
may well be a good reason for not doing this. Maybe Michael can tell
me what that is...?

A very quickly put together patch is attached. I intend this only to
demonstrate what I mean, not because I want to work on it myself.

David

Attachment	Content-Type	Size
v1-0001-A-very-rough-proof-of-concept-to-fix-the-problem-.patch	application/octet-stream	6.8 KB

In response to

RE: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET at 2025-03-10 10:17:45 from Bykov Ivan

Responses

Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET at 2025-03-10 15:49:21 from Sami Imseih
Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET at 2025-03-11 06:50:11 from Michael Paquier

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Alvaro Herrera	2025-03-10 11:50:23	Re: Support NOT VALID / VALIDATE constraint options for named NOT NULL constraints
Previous Message	Nisha Moond	2025-03-10 11:31:55	Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility.