From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Question: test "aggregates" failed in 32-bit machine
Date: 2022-10-01 19:13:59
Message-ID: 961620.1664651639@sss.pgh.pa.us
Lists: pgsql-hackers
I wrote:
> So at this point I've lost all faith in the estimates being meaningful
> at all.
I spent some time today looking into the question of what our qsort
code actually does. I wrote a quick-n-dirty little test module
(attached) to measure the number of comparisons qsort really uses
for assorted sample inputs. The results will move a bit from run
to run because of randomization, but the average counts should be
pretty stable I think. I got results like these:
regression=# create temp table data as
select * from qsort_comparisons(100000);
SELECT 10
regression=# select n * log(groups)/log(2) as est, 100*(n * log(groups)/log(2) - avg_cmps)/avg_cmps as pct_err, * from data;
        est         |      pct_err       |   n    | groups | avg_cmps | min_cmps | max_cmps |          note
--------------------+--------------------+--------+--------+----------+----------+----------+------------------------
                  0 |               -100 | 100000 |      1 |    99999 |    99999 |    99999 | all values the same
 1660964.0474436812 | -5.419880052975057 | 100000 | 100000 |  1756145 |  1722569 |  1835627 | all values distinct
             100000 | -33.33911061041376 | 100000 |      2 |   150013 |   150008 |   150024 | 2 distinct values
             400000 | 11.075628618635713 | 100000 |     16 |   360115 |   337586 |   431376 | 16 distinct values
             600000 |  8.369757612975473 | 100000 |     64 |   553660 |   523858 |   639492 | 64 distinct values
             800000 |  4.770461016221087 | 100000 |    256 |   763574 |   733898 |   844450 | 256 distinct values
            1000000 | 1.5540821186618827 | 100000 |   1024 |   984697 |   953830 |  1111384 | 1024 distinct values
 1457116.0087927429 |  41.97897366170798 | 100000 |  24342 |  1026290 |   994694 |  1089503 | Zipfian, parameter 1.1
 1150828.9986140348 | 158.28880094758154 | 100000 |   2913 |   445559 |   426575 |   511214 | Zipfian, parameter 1.5
  578135.9713524659 |  327.6090378488971 | 100000 |     55 |   135202 |   132541 |   213467 | Zipfian, parameter 3.0
(10 rows)
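
(The attached module itself isn't reproduced inline; as a rough sketch of
the counting approach, using plain libc qsort on made-up integer data
rather than the server-side sort code, the measurement boils down to
something like this:

#include <stdio.h>
#include <stdlib.h>

/* comparator wrapper that counts how many times qsort invokes it */
static long ncompares = 0;

static int
counting_cmp(const void *a, const void *b)
{
	int		x = *(const int *) a;
	int		y = *(const int *) b;

	ncompares++;
	return (x > y) - (x < y);
}

int
main(void)
{
	int		n = 100000;
	int		groups = 256;	/* number of distinct values to generate */
	int	   *arr = malloc(n * sizeof(int));

	for (int i = 0; i < n; i++)
		arr[i] = rand() % groups;

	qsort(arr, n, sizeof(int), counting_cmp);
	printf("n=%d groups=%d comparisons=%ld\n", n, groups, ncompares);
	free(arr);
	return 0;
}

The numbers above come from running the real thing inside the server, of
course; this only shows the shape of the measurement.)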
So "N * log(NumberOfGroups)" is a pretty solid estimate for
uniformly-sized groups ... except when NumberOfGroups = 1 ... but it
is a significant overestimate if the groups aren't uniformly sized.
Now a factor of 2X or 3X isn't awful --- we're very happy to accept
estimates only that good in other contexts --- but I still wonder
whether this is reliable enough to justify the calculations being
done in compute_cpu_sort_cost. I'm still very afraid that the
conclusions we're drawing about the sort costs for different column
orders are mostly junk.
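(Spelling out one row as a sanity check: for the Zipfian-1.5 case, the
estimate is 100000 * log2(2913), about 1150829 comparisons, against
roughly 445559 observed, i.e. a bit over 2.5X too high.)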
In any case, something's got to be done about the failure at
NumberOfGroups = 1. Instead of this:
    correctedNGroups = Max(1.0, ceil(correctedNGroups));
    per_tuple_cost += totalFuncCost * LOG2(correctedNGroups);
I suggest
    if (correctedNGroups > 1.0)
        per_tuple_cost += totalFuncCost * LOG2(correctedNGroups);
    else    /* Sorting N all-alike tuples takes only N-1 comparisons */
        per_tuple_cost += totalFuncCost;
(Note that the ceil() here is a complete waste, because all paths leading
to this produced integral estimates already. Even if they didn't, I see
no good argument why ceil() makes the result better.)
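For illustration only, with made-up standalone values rather than the
planner's actual data structures, the difference at NumberOfGroups = 1
boils down to this:

#include <math.h>
#include <stdio.h>

int
main(void)
{
	double		totalFuncCost = 1.0;	/* assumed cost of one comparison */
	double		nGroups = 1.0;	/* all input values identical */
	double		old_cost;
	double		new_cost;

	/* existing form: Max(1, ...) followed by LOG2() charges zero */
	old_cost = totalFuncCost * log2(fmax(1.0, ceil(nGroups)));

	/* proposed form: charge about one comparison per tuple instead */
	if (nGroups > 1.0)
		new_cost = totalFuncCost * log2(nGroups);
	else
		new_cost = totalFuncCost;

	printf("old per-tuple cost = %g, new per-tuple cost = %g\n",
		   old_cost, new_cost);
	return 0;
}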
I'm still of the opinion that we need to revert this code for now.
regards, tom lane
Attachment: count_comparisons.c (text/x-c, 7.2 KB)