From: | Ranier Vilela <ranier(dot)vf(at)gmail(dot)com> |
---|---|
To: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com> |
Subject: | Re: PG15 beta1 sort performance regression due to Generation context change |
Date: | 2022-05-22 19:11:46 |
Message-ID: | CAEudQAoApw7T3wCv89oCF7tksR6Y8i2YTyprZZaRgYo8snzS2g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi David,
>Over the past few days I've been gathering some benchmark results
>together to show the sort performance improvements in PG15 [1].
>One of the test cases I did was to demonstrate Heikki's change to use
>a k-way merge (65014000b).
>The test I did to try this out was along the lines of:
>set max_parallel_workers_per_gather = 0;
>create table t (a bigint not null, b bigint not null, c bigint not
>null, d bigint not null, e bigint not null, f bigint not null);
>insert into t select x,x,x,x,x,x from generate_Series(1,140247142) x; --
10GB!
>vacuum freeze t;
>The query I ran was:
>select * from t order by a offset 140247142;
I redid this test here:
Windows 10 64 bits
msvc 2019 64 bits
RAM 8GB
SSD 256 GB
HEAD (default configuration)
Time: 229396,551 ms (03:49,397)
PATCHED:
Time: 220887,346 ms (03:40,887)
>I tested various sizes of work_mem starting at 4MB and doubled that
>all the way to 16GB. For many of the smaller values of work_mem the
>performance is vastly improved by Heikki's change, however for
>work_mem = 64MB I detected quite a large slowdown. PG14 took 20.9
>seconds and PG15 beta 1 took 29 seconds!
>I've been trying to get to the bottom of this today and finally have
>discovered this is due to the tuple size allocations in the sort being
>exactly 64 bytes. Prior to 40af10b57 (Use Generation memory contexts
>to store tuples in sorts) the tuple for the sort would be stored in an
>aset context. After 40af10b57 we'll use a generation context. The
>idea with that change is that the generation context does no
>power-of-2 round ups for allocations, so we save memory in most cases.
>However, due to this particular test having a tuple size of 64-bytes,
>there was no power-of-2 wastage with aset.
>The problem is that generation chunks have a larger chunk header than
>aset do due to having to store the block pointer that the chunk
>belongs to so that GenerationFree() can increment the nfree chunks in
>the block. aset.c does not require this as freed chunks just go onto a
>freelist that's global to the entire context.
>Basically, for my test query, the slowdown is because instead of being
>able to store 620702 tuples per tape over 226 tapes with an aset
>context, we can now only store 576845 tuples per tape resulting in
>requiring 244 tapes when using the generation context.
>If I had added column "g" to make the tuple size 72 bytes causing
>aset's code to round allocations up to 128 bytes and generation.c to
>maintain the 72 bytes then the sort would have stored 385805 tuples
>over 364 batches for aset and 538761 tuples over 261 batches using the
>generation context. That would have been a huge win.
>So it basically looks like I discovered a very bad case that causes a
>significant slowdown. Yet other cases that are not an exact power of
>2 stand to gain significantly from this change.
>One thing 40af10b57 does is stops those terrible performance jumps
>when the tuple size crosses a power-of-2 boundary. The performance
>should be more aligned to the size of the data being sorted now...
>Unfortunately, that seems to mean regressions for large sorts with
>power-of-2 sized tuples.
It seems to me that the solution would be to use aset allocations
when the size of the tuples is power-of-2?
if (state->sortopt & TUPLESORT_ALLOWBOUNDED ||
(state->memtupsize & (state->memtupsize - 1)) == 0)
state->tuplecontext = AllocSetContextCreate(state->sortcontext,
"Caller tuples", ALLOCSET_DEFAULT_SIZES);
else
state->tuplecontext = GenerationContextCreate(state->sortcontext,
"Caller tuples", ALLOCSET_DEFAULT_SIZES);
I took a look and tried some improvements to see if I had a better result.
Would you mind taking a look and testing?
regards,
Ranier Vilela
Attachment | Content-Type | Size |
---|---|---|
sort_bench.txt | text/plain | 10.1 KB |
002-improve-sort.patch | application/octet-stream | 5.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Nathan Bossart | 2022-05-22 20:26:08 | Re: docs: mention "pg_read_all_stats" in "track_activities" description |
Previous Message | Stephen Frost | 2022-05-22 14:12:58 | Re: Unsubscribing from this mailing list. |