Re: Make tuple deformation faster

From: Andres Freund <andres(at)anarazel(dot)de>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Make tuple deformation faster
Date: 2024-10-03 21:23:33
Message-ID: ftqlghermbpeewc5kci4jpe7yxe3kaccqtqvhuqs3zuhytfacp@eeeo44x2f73e
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

A few weeks ago David and I discussed this patch. We were curious *why* the
flags approach was slower. It turns out that, at least on my machine, this is
just a compiler optimization issue. Putting a pg_compiler_barrier() just
after:

> for (; attnum < natts; attnum++)
> {
> - Form_pg_attribute thisatt = TupleDescAttr(tupleDesc, attnum);
> + CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);

addressed the issue. The problem basically is that instead of computing the
address of thisatt once, gcc for some reason instead uses complex addressing
lea instructions, which are slower on some machines.

Not sure what a good way to deal with that is. I haven't tried, but it could
be that just advancing thisatt using++ would do the trick?

I think this generally looks quite good and the performance wins are quite
nice! I'm a bit concerned about the impact on various extensions - I haven't
looked, but I wouldn't be surprised if this requires a bunch of of changes.
Perhaps we could reduce that a bit?

Could it make sense to use bitfields instead of flag values, to reduce the
impact?

> From 35a6cdc2056decc31d67ad552826b573d2b66073 Mon Sep 17 00:00:00 2001
> From: David Rowley <dgrowley(at)gmail(dot)com>
> Date: Tue, 28 May 2024 20:10:50 +1200
> Subject: [PATCH v3 2/5] Introduce CompactAttribute array in TupleDesc
>
> This array stores a subset of the fields of FormData_pg_attribute,
> primarily the ones for deforming tuples, but since we have additional
> space, pack a few additional boolean columns in the attflags field.
>
> Many areas of the code can get away with only accessing the
> CompactAttribute array, which because the details of each attribute is
> stored much more densely than FormData_pg_attribute, many operations can
> be performed accessing fewer cachelines which can improve performance.
>
> This also makes pg_attribute.attcacheoff redundant. A follow-on commit
> will remove it.

Yay.

> diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
> index 2c435cdcb2..478ebbe1f4 100644
> --- a/src/include/access/tupdesc.h
> +++ b/src/include/access/tupdesc.h
> @@ -45,6 +45,46 @@ typedef struct TupleConstr
> bool has_generated_stored;
> } TupleConstr;
>
> +/*
> + * CompactAttribute
> + * Cut-down version of FormData_pg_attribute for faster access for tasks
> + * such as tuple deformation.
> + */
> +typedef struct CompactAttribute
> +{
> + int32 attcacheoff; /* fixed offset into tuple, if known, or -1 */

For a short while I thought we could never have a large offset here, due to
toast, but that's only when tuples come from a table...

> + int16 attlen; /* attr len in bytes or -1 = varlen, -2 =
> + * cstring */

It's hard to imagine fixed-dith types longer than 256 bits long making sense,
but even if we wanted to change that, it'd not be sensible to change that in
this patch...

> +#ifdef USE_ASSERT_CHECKING
> +
> +/*
> + * Accessor for the i'th CompactAttribute of tupdesc. In Assert enabled
> + * builds we verify that the CompactAttribute is populated correctly.
> + * This helps find bugs in places such as ALTER TABLE where code makes changes
> + * to the FormData_pg_attribute but forgets to call populate_compact_attribute
> + */
> +static inline CompactAttribute *
> +TupleDescCompactAttr(TupleDesc tupdesc, int i)
> +{
> + CompactAttribute snapshot;
> + CompactAttribute *cattr = &tupdesc->compact_attrs[i];
> +
> + /*
> + * Take a snapshot of how the CompactAttribute is now before calling
> + * populate_compact_attribute to make it up-to-date with the
> + * FormData_pg_attribute.
> + */
> + memcpy(&snapshot, cattr, sizeof(CompactAttribute));
> +
> + populate_compact_attribute(tupdesc, i);
> +
> + /* reset attcacheoff back to what it was */
> + cattr->attcacheoff = snapshot.attcacheoff;
> +
> + /* Ensure the snapshot matches the freshly populated CompactAttribute */
> + Assert(memcmp(&snapshot, cattr, sizeof(CompactAttribute)) == 0);
> +
> + return cattr;
> +}
> +
> +#else
> +/* Accessor for the i'th CompactAttribute of tupdesc */
> +#define TupleDescCompactAttr(tupdesc, i) (&(tupdesc)->compact_attrs[(i)])
> +#endif

Personally I'd have the ifdef inside the static inline function, rather than
changing between a static inline and a macro depending on
USE_ASSERT_CHECKING..

> #ifndef FRONTEND
> /*
> - * Given a Form_pg_attribute and a pointer into a tuple's data area,
> + * Given a CompactAttribute pointer and a pointer into a tuple's data area,
> * return the correct value or pointer.
> *
> * We return a Datum value in all cases. If the attribute has "byval" false,
> @@ -43,7 +43,7 @@ att_isnull(int ATT, const bits8 *BITS)
> *
> * Note that T must already be properly aligned for this to work correctly.
> */
> -#define fetchatt(A,T) fetch_att(T, (A)->attbyval, (A)->attlen)
> +#define fetchatt(A, T) fetch_att(T, CompactAttrByVal(A), (A)->attlen)

Stuff like this seems like it might catch some extensions unaware. I think it
might make sense to change macros like this to be a static inline, so that you
get proper type mismatch errors, rather than errors about invalid casts or
nonexistant fields.

> From c75d072b8bc43ba4f9b7fbe2f99b65edc7421a15 Mon Sep 17 00:00:00 2001
> From: David Rowley <dgrowley(at)gmail(dot)com>
> Date: Wed, 29 May 2024 12:19:03 +1200
> Subject: [PATCH v3 3/5] Optimize alignment calculations in tuple form/deform
>
> This converts CompactAttribute.attalign from a char which is directly
> derived from pg_attribute.attalign into a uint8 which specifies the
> number of bytes to align the column by. Also, rename the field to
> attalignby to make the distinction more clear in code.
>
> This removes the complexity of checking each char value and transforming
> that into the appropriate alignment call. This can just be a simple
> TYPEALIGN passing in the number of bytes.

I like this a lot.

> diff --git a/contrib/amcheck/verify_heapam.c b/contrib/amcheck/verify_heapam.c
> index 08772de39f..b66eb178b9 100644
> --- a/contrib/amcheck/verify_heapam.c
> +++ b/contrib/amcheck/verify_heapam.c
> @@ -1592,7 +1592,7 @@ check_tuple_attribute(HeapCheckContext *ctx)
> /* Skip non-varlena values, but update offset first */
> if (thisatt->attlen != -1)
> {
> - ctx->offset = att_align_nominal(ctx->offset, thisatt->attalign);
> + ctx->offset = att_nominal_alignby(ctx->offset, thisatt->attalignby);
> ctx->offset = att_addlength_pointer(ctx->offset, thisatt->attlen,
> tp + ctx->offset);
> if (ctx->tuphdr->t_hoff + ctx->offset > ctx->lp_len)

A bit confused about the change in naming policy here...

> From d1ec19a46a480b0c75f9df468b2765ad4e51dce2 Mon Sep 17 00:00:00 2001
> From: David Rowley <dgrowley(at)gmail(dot)com>
> Date: Tue, 3 Sep 2024 14:05:30 +1200
> Subject: [PATCH v3 5/5] Try a larger CompactAttribute struct without flags
>
> Benchmarks have shown that making the CompactAttribute struct larger and
> getting rid of the flags to reduce the bitwise-ANDing requirements makes
> things go faster.

I think we have some way of not needing this. I don't like my compiler barrier
hack, but I'm sure we can hold the hands of the compiler sufficiently to
generate useful code.

But this made me wonder:

> ---
> src/backend/access/common/tupdesc.c | 21 ++++++----------
> src/include/access/tupdesc.h | 39 ++++++++++-------------------
> 2 files changed, 20 insertions(+), 40 deletions(-)
>
> diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
> index 74f22cffb9..95c92e6585 100644
> --- a/src/backend/access/common/tupdesc.c
> +++ b/src/backend/access/common/tupdesc.c
> @@ -67,20 +67,13 @@ populate_compact_attribute(TupleDesc tupdesc, int i)
> dst->attcacheoff = -1;
> dst->attlen = src->attlen;
>
> - dst->attflags = 0;
> -
> - if (src->attbyval)
> - dst->attflags |= COMPACT_ATTR_FLAG_BYVAL;
> - if (src->attstorage != TYPSTORAGE_PLAIN)
> - dst->attflags |= COMPACT_ATTR_FLAG_IS_PACKABLE;
> - if (src->atthasmissing)
> - dst->attflags |= COMPACT_ATTR_FLAG_HAS_MISSING;
> - if (src->attisdropped)
> - dst->attflags |= COMPACT_ATTR_FLAG_IS_DROPPED;
> - if (src->attgenerated)
> - dst->attflags |= COMPACT_ATTR_FLAG_IS_GENERATED;
> - if (src->attnotnull)
> - dst->attflags |= COMPACT_ATTR_FLAG_IS_NOTNULL;
> +
> + dst->attbyval = src->attbyval;
> + dst->attispackable = (src->attstorage != TYPSTORAGE_PLAIN);
> + dst->atthasmissing = src->atthasmissing;
> + dst->attisdropped = src->attisdropped;
> + dst->attgenerated = src->attgenerated;
> + dst->attnotnull = src->attnotnull;

It'd sure be nice if we could condense some of these fields in pg_attribute
too. It obviously shouldn't be a requirement for this patch, don't get me
wrong!

Production systems have often very large pg_attribute tables, which makes this
a fairly worthwhile optimization target.

I wonder if we could teach the bootstrap and deform logic about bool:1 or such
and have it generate the right code for that.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2024-10-03 21:27:40 Re: Should rolpassword be toastable?
Previous Message Nathan Bossart 2024-10-03 21:13:41 Re: Enable data checksums by default