Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2)

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>, David Christensen <david(at)pgguru(dot)net>
Subject: Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2)
Date: 2024-01-24 12:02:00
Message-ID: CAEze2Wj2r=n8bGNypd2U=yGu9rc=xQATEpfMRfya9H18f3amPw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 19 Jan 2024 at 05:55, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Wed, Nov 1, 2023 at 2:42 AM Matthias van de Meent
> <boekewurm+postgres(at)gmail(dot)com> wrote:
> >
> > Hi,
> >
> > Currently, nbtree code compares each and every column of an index
> > tuple during the binary search on the index page. With large indexes
> > that have many duplicate prefix column values (e.g. an index on (bool,
> > bool, uuid) ) that means a lot of wasted time getting to the right
> > column.
> >
> > The attached patch improves on that by doing per-page dynamic prefix
> > truncation: If we know that on both the right and left side there are
> > index tuples where the first two attributes are equal to the scan key,
> > we skip comparing those attributes at the current index tuple and
> > start with comparing attribute 3, saving two attribute compares. We
> > gain performance whenever comparing prefixing attributes is expensive
> > and when there are many tuples with a shared prefix - in unique
> > indexes this doesn't gain much, but we also don't lose much in this
> > case.
> >
> > This patch was originally suggested at [0], but it was mentioned that
> > they could be pulled out into it's own thread. Earlier, the
> > performance gains were not clearly there for just this patch, but
> > after further benchmarking this patch stands on its own for
> > performance: it sees no obvious degradation of performance, while
> > gaining 0-5% across various normal indexes on the cc-complete sample
> > dataset, with the current worst-case index shape getting a 60%+
> > improved performance on INSERTs in the tests at [0].
>
> +1 for the idea, I have some initial comments while reading through the patch.

Thank you for the review.

> 1.
> Commit message refers to a non-existing reference '(see [0])'.

Noted, I'll update that.

> 2.
> +When we do a binary search on a sorted set (such as a BTree), we know that a
> +tuple will be smaller than its left neighbour, and larger than its right
> +neighbour.
>
> I think this should be 'larger than left neighbour and smaller than
> right neighbour' instead of the other way around.

Noted, will be fixed, too.

> 3.
> +With the above optimizations, dynamic prefix truncation improves the worst
> +case complexity of indexing from O(tree_height * natts * log(tups_per_page))
> +to O(tree_height * (3*natts + log(tups_per_page)))
>
> Where do the 3*natts come from? Is it related to setting up the
> dynamic prefix at each level?

Yes: We need to establish prefixes for both a tuple that's ahead of
the to-be-compared tuple, and one that's after the to-be-compared
tuple. Assuming homogenous random distribution of scan key accesses
across the page (not always the case, but IMO a reasonable starting
point) this would average to 3 unprefixed compares before you have
established both a higher and a lower prefix.

> 4.
> + /*
> + * All tuple attributes are equal to the scan key, only later attributes
> + * could potentially not equal the scan key.
> + */
> + *compareattr = ntupatts + 1;
>
> Can you elaborate on this more? If all tuple attributes are equal to
> the scan key then what do those 'later attributes' mean?

In inner pages, tuples may not have all key attributes, as some may
have been truncated away in page splits. So, tuples that have at least
the same prefix as this (potentially truncated) tuple will need to be
compared starting at the first missing attribute of this tuple, i.e.
ntupatts + 1.

Kind regards,

Matthias van de Meent

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2024-01-24 12:15:55 Re: Make COPY format extendable: Extract COPY TO format implementations
Previous Message shveta malik 2024-01-24 11:47:01 Re: Synchronizing slots from primary to standby