Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Sami Imseih <samimseih(at)gmail(dot)com>, Ivan Bykov <I(dot)Bykov(at)modernsys(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET
Date: 2025-03-24 02:22:43
Message-ID: Z-DB83o5AZmgnvnA@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 21, 2025 at 02:45:09PM +0900, Michael Paquier wrote:
> Potential ideas about optimizing the branches seem worth
> investigating, I am not sure that I would be able to dig into that
> until the feature freeze gong, unfortunately, and it would be nice to
> fix this issue for this release. I'll see about doing some pgbench
> runs on my side, as well, with tpc-b Querys.

So, here are some numbers with a "SELECT 1;" using pgbench, which
would be a worst-case. Then, at 128 clients on a machine I have
around:
- HEAD, compute_query_id = on:
tps = 282081.839715
tps = 281379.854539
tps = 283708.205096
tps = 280197.225782 #worst
tps = 291955.914664 #best
tps = 284172.351609
tps = 287234.624720
tps = 287915.683291 #best
tps = 277004.123742 #worst
tps = 281326.071323
- Patch, compute_query_id = on:
tps = 289746.641246 #best
tps = 279689.654196
tps = 284874.774620 #best
tps = 282312.246610
tps = 275807.057758 #worst
tps = 276058.858384 #worst
tps = 282705.638685
tps = 282651.367641
tps = 276777.418569
tps = 283137.265298

Removing the two best and two worst numbers, taking an average as in:
(head_tps - patch_tps) * 2 / (head_tps + patch_tps)
Then that's 283317.15 (HEAD) vs 281212.26 (patch) so a 0.7% decrease.
Still that could be claimed as noise.

Second set, at 1 client, and a better pattern becomes available:
- HEAD, compute_query_id = on:
tps = 19170.473320 #best
tps = 18722.828797
tps = 19138.765391
tps = 18243.611593
tps = 17878.605004 #worst
tps = 19557.753327 #best
tps = 18395.734885
tps = 18433.188326
tps = 18227.914138 #worst
tps = 18450.236841
- Patch, compute_query_id = on:
tps = 16611.766411
tps = 15992.734615 #worst
tps = 16465.077482
tps = 16726.097318
tps = 17284.874897 #best
tps = 16076.378905
tps = 17971.534072 #best
tps = 15733.718159 #worst
tps = 16911.014660
tps = 17217.202679

And here we get 18564.06 (head) vs 16667.92 (patch) so a 10.7%
difference in this run. Hence the automatic addition of NULL to the
jumbling is disappointing here, even if this is what I'd see this as a
worst case scenario, unlikely what one would see for real if they care
about monitoring.

So, on the other hand, we still have the trick to switch the fields,
which means that we would not suffer any performance penalty, while
fixing the original issue. And with the feature freeze in sight, time
is running short if we want to do something here.

NB: In case, a second run with 1 client showing the same trend:
- HEAD:
tps = 19697.406181
tps = 19474.495729
tps = 18614.615630
tps = 18868.245926
tps = 18949.775164
tps = 18943.311921
tps = 18824.192383
tps = 19104.542301
tps = 18950.870646
tps = 19331.829745
- Patch:
tps = 17887.864764
tps = 17317.628065
tps = 17090.278323
tps = 15876.258607
tps = 16818.644938
tps = 16816.334313
tps = 18134.148809
tps = 16891.809962
tps = 18193.104125
tps = 16866.726990
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2025-03-24 03:26:50 Re: support virtual generated column not null constraint
Previous Message Hayato Kuroda (Fujitsu) 2025-03-24 02:21:43 RE: pg_recvlogical requires -d but not described on the documentation