From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Victor Yegorov <vyegorov(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Make tuple deformation faster |
Date: | 2024-12-27 13:02:45 |
Message-ID: | CAApHDvpCu+fYqh5rKVgXmn_Pfc_ObMwa7cSX8DJ_jHzv23q2Cg@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, 20 Dec 2024 at 23:31, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> The attcacheoff removal is now pushed. I've attached the two remaining patches.
So there's still one remaining patch in this series. I delayed a bit
to test this further as I wondered if it was worth adding another
inlined version of slot_deform_heap_tuple_internal() for tuples
without HeapTupleHasVarWidth. After some benchmarking, it seems it's
not always better. Also, after looking at how my compiler implemented
the switch in fetch_att() for the different byval sizes, I
experimented with a patch to mask out the upper portion of the datum
for byval types smaller than 8 bytes with a static lookup table. I
don't think Valgrind will like that, plus I think that method might
only work on little-endian machines.
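To make the masking idea concrete, here is a minimal sketch. It is not the code from the attached fetch_att patch; the names byval_mask, fetch_byval_masked and host_is_little_endian are hypothetical. It always loads 8 bytes and masks off the unwanted upper bytes, which is why it only gives the right answer on little-endian hardware and why a tool like Valgrind could complain about reading past the end of the attribute. Note that it zero-extends narrow values; whether that matches what the per-type Datum conversions expect is not addressed here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical lookup table, indexed by attribute length in bytes.
 * Only lengths 1, 2, 4 and 8 are valid by-value sizes in this sketch.
 */
static const uint64_t byval_mask[9] = {
    0,
    UINT64_C(0x00000000000000FF),   /* 1 byte */
    UINT64_C(0x000000000000FFFF),   /* 2 bytes */
    0,
    UINT64_C(0x00000000FFFFFFFF),   /* 4 bytes */
    0, 0, 0,
    UINT64_C(0xFFFFFFFFFFFFFFFF)    /* 8 bytes */
};

/* Returns 1 when the lowest-order byte of an integer is stored first. */
static inline int
host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/*
 * Branch-free fetch: load 8 bytes, then keep only the low attlen bytes.
 * The caller must guarantee that 8 bytes are readable at ptr, even when
 * attlen is smaller. The result is zero-extended.
 */
static inline uint64_t
fetch_byval_masked(const char *ptr, int attlen)
{
    uint64_t raw;

    assert(attlen == 1 || attlen == 2 || attlen == 4 || attlen == 8);
    memcpy(&raw, ptr, sizeof(raw));
    return raw & byval_mask[attlen];
}
```

On a big-endian machine the value's bytes sit at the high end of the 8-byte load, so the same masks would keep the wrong bytes.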
Using the attached deform_test2.sh, I ran the tests on 3 different
machines. With both AMD machines, I used both gcc and clang. The 6
graphs in the attached screenshot show the results of the 3 different
tests. The left column of graphs is the TPS result and the right
column is the percentage increase of patched over master. The first
row of graphs is a 16-column table with all fixed-width columns and no
NULLs. The 2nd row has 16 fixed-width columns, but the first column is
NULL. The 3rd row has a varlena first column followed by 15
fixed-width columns and no NULLs.
I propose to commit the 0001 patch only. The performance increase
seems nice at around 5-20% in my tests. The 0002 patch adds the
extra special-case function for tuples with only fixed-width attributes.
There are some performance regressions with this patch, so I'm not
planning on using it as it is. I'm planning on trying another
approach as I think there's quite a lot of performance left with tuple
deforming / forming. I'm planning on experimenting with having the
TupleDesc always populate the attcacheoff for the leading fixed-width
columns and the first varlena column and storing the attnum of the
first variable-length attribute in the TupleDesc (i.e. the final column
to have a valid attcacheoff). This means it'll be possible to use the
fixed-width deforming up to the first variable length attr according
to the TupleDesc and always use attcacheoff for that. This also has
the advantage of being quite good for functions such as
heap_compute_data_size() as we already know the position of the first
NULL (if any) when that's called from somewhere like
heap_form_minimal_tuple(). This means we only have to calculate the
size from the first variable length attribute or the first NULL
(whichever comes first), and we can start the size calc at the
attcacheoff for that attribute and only add the size needed for the
remaining columns. For tuples with no NULLs and only fixed-width
types, that basically means heap_compute_data_size() returns
attcacheoff + attlen of the final column. No looping. nocachegetattr()
can also be improved similarly.
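The idea above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not PostgreSQL code: SketchAttr, SketchDesc, sketch_populate_offsets and sketch_fixed_data_size are hypothetical names, alignment is given directly in bytes, and the complication of short varlena headers (which affects where a variable-length value really starts) is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified attribute descriptor. */
typedef struct SketchAttr
{
    int attlen;         /* > 0: fixed width in bytes, -1: variable length */
    int attalign;       /* required alignment in bytes, a power of two */
    int attcacheoff;    /* cached offset into the tuple data, -1 if unknown */
} SketchAttr;

/* Hypothetical, simplified tuple descriptor. */
typedef struct SketchDesc
{
    int         natts;
    int         firstvarlena;   /* index of first varlena attr, natts if none */
    SketchAttr *attrs;
} SketchDesc;

static size_t
align_up(size_t off, int align)
{
    return (off + (size_t) align - 1) & ~((size_t) align - 1);
}

/*
 * Cache offsets for the leading fixed-width attributes and for the first
 * variable-length attribute. Everything after that depends on the data in
 * each tuple, so those offsets stay unknown.
 */
static void
sketch_populate_offsets(SketchDesc *desc)
{
    size_t off = 0;
    int    i;

    desc->firstvarlena = desc->natts;
    for (i = 0; i < desc->natts; i++)
    {
        SketchAttr *att = &desc->attrs[i];

        off = align_up(off, att->attalign);
        att->attcacheoff = (int) off;
        if (att->attlen < 0)
        {
            desc->firstvarlena = i;
            i++;
            break;
        }
        off += (size_t) att->attlen;
    }
    for (; i < desc->natts; i++)
        desc->attrs[i].attcacheoff = -1;
}

/*
 * Data size for a tuple with no NULLs and only fixed-width attributes:
 * the cached offset of the final column plus its length. No loop needed.
 */
static size_t
sketch_fixed_data_size(const SketchDesc *desc)
{
    const SketchAttr *last;

    assert(desc->natts > 0 && desc->firstvarlena == desc->natts);
    last = &desc->attrs[desc->natts - 1];
    return (size_t) last->attcacheoff + (size_t) last->attlen;
}
```

For a descriptor with a 4-byte, an 8-byte and a 2-byte column (aligned to their own sizes), the cached offsets come out as 0, 8 and 16, and the size shortcut returns 18. A tuple that does contain a NULL or a variable-length column would instead start its size calculation at the cached offset of that column and loop only over the columns after it.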
Happy to hear any thoughts on any of the above. I am planning on
pushing 0001 soon.
David
Attachment | Content-Type | Size |
---|---|---|
deform_test2.sh.txt | text/plain | 2.2 KB |
v9-0001-Speedup-tuple-deformation-with-additional-functio.patch | application/octet-stream | 11.1 KB |
v9-0002-Add-special-case-tuple-deform-code-for-no-varwidt.patch | application/octet-stream | 7.3 KB |
v9-0003-Make-fetch_att-faster-on-little-endian-hardware.patch | application/octet-stream | 2.0 KB |
v9_results.png | image/png | 329.2 KB |