Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2)

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, David Christensen <david(at)pgguru(dot)net>
Subject: Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2)
Date: 2024-11-13 21:14:20
Message-ID: CAH2-Wz=Q7VnvA8AiB5ONeM7Xu4e9jKgyMmzK1r64gOTnGeRcFg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 13, 2024 at 3:30 PM Dmitry Dolgov <9erthalion6(at)gmail(dot)com> wrote:
> > On Tue, Aug 13, 2024 at 02:39:10PM GMT, Peter Geoghegan wrote:
> > On Tue, Aug 6, 2024 at 5:42 PM Matthias van de Meent
> > To be clear, this test involves bulk loading of an unlogged table (the
> > land registry table). The following composite index is created on the
> > table before we insert any rows, so most of the cycles here are in
> > index maintenance including _bt_search descents:
> >
> > CREATE INDEX composite ON land2 USING btree (county COLLATE "C", city
> > COLLATE "C", locality COLLATE "C");

> Under the danger of showing my ignorance, what is the definition of land
> registry benchmark? I think it would be useful if others could reproduce
> the results as well, especially if they're somewhat surprising.

It's a sample dataset that I've found useful from time to time,
particularly when testing nbtree features. Usually using a composite
index like the one I described.

One slightly useful (though far from unique) property of such an index
is that it contains data that's low cardinality (sometimes extremely
low cardinality) across multiple index columns. With variable-width
(text) index columns. That specific combination made the index a
decent test of certain issues affecting the nbtsplitloc.c split point
choice logic during work on Postgres 12 and 13. I believe that
Matthias independently found it useful on a number of other occasions,
too.

See https://wiki.postgresql.org/wiki/Sample_Databases for instructions
on how to set it up for yourself. You could probably come up with a
way of generating a similar dataset, without needing to download
anything, though. The fact that I found it useful in the past is at
least somewhat arbitrary.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David E. Wheeler 2024-11-13 21:38:26 Re: RFC: Extension Packaging & Lookup
Previous Message Dmitry Dolgov 2024-11-13 20:30:23 Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2)