RE: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

From: Bykov Ivan <i(dot)bykov(at)modernsys(dot)ru>
To: Michael Paquier <michael(at)paquier(dot)xyz>, David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Sami Imseih <samimseih(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET
Date: 2025-03-17 07:33:42
Message-ID: b220da2ce9a446bd90ff48ed9508786f@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello, Michael!

> So, here is attached a counter-proposal, where we can simply added a
> counter tracking a node count in _jumbleNode() to add more entropy to
> the mix, incrementing it as well for NULL nodes.

It definitely looks like a more reliable solution than my variant, which only
counts NULL nodes.

However, we already knew about the overhead of adding `\0` bytes for
every NULL field.

> So that adds about 9.1% overhead to jumbling, on average.

See:
https://www.postgresql.org/message-id/flat/5ac172e0b77a4baba50671cd1a15285f%40localhost.localdomain#6c43f354f5f42d2a27e6824faa660a86

Is it really worth spending extra execution time to increase entropy
when we have non-NULL nodes?

Maybe we should choose to add node_count to the hash every time we visit
non-NULL or NULL nodes.
We could also add entropy if we see a change in the node->type value for
non-NULL variants.

Your Variant
------------

< node_count = 1 > < node 1 >
< node_count = 2 > /* node 2 = NULL */
< node_count = 3 > < node 3 >

Alternative 1 (mark only NULL Nodes)
------------------------------------

/* node_count = 1 */ < node 1 >
< node_count = 2 > /* node 2 = NULL */
/* node_count = 3 */ < node 3 >

Alternative 2 (mark only non-NULL Nodes)
----------------------------------------
This could address concerns about problems related to visiting nodes with the
same content placed in different query tree branches.

< node_count = 1 > < node 1 >
/* node_count = 2 */ /* node 2 = NULL */
< node_count = 3 > < node 3 >

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jakub Wartak 2025-03-17 07:43:55 Re: BitmapHeapScan streaming read user and prelim refactoring
Previous Message Jakub Wartak 2025-03-17 07:28:46 Re: Draft for basic NUMA observability