From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Removing alignment padding for byval types |
Date: | 2019-10-31 18:48:21 |
Message-ID: | 20191031184821.qhgzlus2xmbvqpyl@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
We currently align byval types such as int4/8, float4/8, timestamp *,
date etc, even though we mostly don't need to. When tuples are deformed,
all byval types are copied out from the tuple data into the
corresponding Datum array, therefore the original alignment in the tuple
data doesn't matter. This is different from byref types, where the
Datum formed will often be a pointer into the tuple data.
While there are some older systems where it could be a bit slower to
copy data out from unaligned positions into the datum array, this is
more than bought back by the next point:
The fact that these types are aligned has substantial costs:
For one, we often waste substantial amounts of space inside tables with
alignment padding. It's not uncommon to see about 30% or more of space
wasted (especially when taking alignment of the first column into
account).
For another, and this I think is less obvious, we actually waste
substantial amounts of CPU maintaining the alignment. This is primarily
the case because we have to perform to align the pointer to the next
field during tuple [de]forming. Those instructions [1] have to be
executed taking time, but what's worse, they also reduce the ability of
out-of-order execution to hide latencies. There's a hard dependency on
knowing the offset to the next column to be able to continue with the
next column.
There's two reasons why we can't just set the alignment for these types
to 'c'.
1) pg_upgrade, for fairly obvious reasons
2) We map catalog table rows to structs, in a *lot* of places.
It seems to me that, despite the above, it's still worth trying to
improve upon the current state, to benefit from reduced space and CPU
usage.
As it turns out we already separate out the alignment for a type, and a
column, between pg_type.typalign and pg_attribute.attalign. One way to
tackle this would be to allow to specify, for byval types only, at
column creation time whether a column uses a 'struct-mappable' alignment
or not. When set, set attalign to pg_type.typalign for alignment, when
not, to 'c'. By changing pg_dump in binary upgrade mode to emit the
necessary options, and by adding such options during bki processing,
we'd solve 1) and 2), but otherwise gain the benefits.
Alternatively we could declare such a propert on the table level, but
that seems more restrictive, without a corresponding upside.
It's possible that we should do something related with a few varlena
datatypes. We currently use intalign for types like text, json, and as
far as I can tell that does not make all that much sense. They're not
struct mappable *anyway* (and if they were, they'd need to be 8 byte
aligned on common platforms, __alignof__(void*) is 8). We'd have to take
a bit of care to treat the varlena header as unaligned - but we need to
do so anyway, because of 1byte varlenas. Short varlenas seems to make it
less crucial to pursue this, as the average datum that'd benefit is long
enough to make padding a non-issue. So I don't think it'd make sense to
tackle this project at the same time.
To fully benefit from the increased tuple deforming speed, it might be
beneficial to branch very early between two different versions within
slot_deform_heap_tuple, having determined whether there's any byval
columns with alignment requirements at slot creation /
ExecSetSlotDescriptor() time (or even set a different callback
getsomeattrs callback, but that's a bit more complicated).
Thoughts?
Independent of the above, I think it might make sense to replace
pg_attribute.attalign with a smallint or such. It's a bit absurd that we
need code like
#define att_align_nominal(cur_offset, attalign) \
( \
((attalign) == 'i') ? INTALIGN(cur_offset) : \
(((attalign) == 'c') ? (uintptr_t) (cur_offset) : \
(((attalign) == 'd') ? DOUBLEALIGN(cur_offset) : \
( \
AssertMacro((attalign) == 's'), \
SHORTALIGN(cur_offset) \
))) \
)
instead of just using TYPEALIGN(). There's no need to adapt CREATE TYPE,
or pg_type - we should just store the number in
pg_attribute.attalign. That keeps CREATE TYPE 32/64bit/arch independent,
doesn't require reconstructing c/s/i/d in pg_dump, simplifies the
generated code [1], and would also "just" work for what I described
earlier in this email.
Greetings,
Andres Freund
[1] E.g. as a function of (void *ptr, char attalign) this ends up with assembly
like
.globl alignme
.type alignme, @function
alignme:
.LFB210:
.cfi_startproc
movq %rdi, %rax
cmpb $105, %sil
je .L496
cmpb $99, %sil
je .L492
addq $7, %rax
leaq 1(%rdi), %rdi
andq $-8, %rax
andq $-2, %rdi
cmpb $100, %sil
cmovne %rdi, %rax
.L492:
ret
.p2align 4,,10
.p2align 3
.L496:
addq $3, %rax
andq $-4, %rax
ret
.cfi_endproc
using (void *ptr, int8 attalign) instead yields:
.globl alignme2
.type alignme2, @function
alignme2:
.LFB211:
.cfi_startproc
movzbl %sil, %esi
leal -1(%rsi), %eax
cltq
negl %esi
addq %rax, %rdi
movslq %esi, %rax
andq %rdi, %rax
ret
.cfi_endproc
From | Date | Subject | |
---|---|---|---|
Next Message | Stephen Frost | 2019-10-31 18:50:35 | Re: Proposal: Global Index |
Previous Message | Fabien COELHO | 2019-10-31 18:46:30 | Re: pgbench - extend initialization phase control |