Quick Links

Re: 2019-03 CF Summary / Review - Tranche #2

From:	Peter Geoghegan <pg(at)bowt(dot)ie>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: 2019-03 CF Summary / Review - Tranche #2
Date:	2019-02-16 06:52:41
Message-ID:	CAH2-Wz=TTUpuRS1X26o1vBCEET3Y+ncwswmYfDy3XsUErx7PuA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Fri, Feb 15, 2019 at 9:45 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> - Make nbtree keys unique by appending heap TID, suffix truncation
>
> NR: Important, seemingly carefully worked on feature. But it's
> *complicated* stuff, and there's only been a bit of design review by
> Heikki. The author's a committer. I don't feel qualified to judge
> this.

I think that's fair. The best way of understanding the risks as a
non-expert is to think of the patch as having two distinct components:

1. Make nbtree entries unique + add suffix truncation -- the basic mechanism.

2. Make more intelligent decisions about where to split pages, to work
with suffix truncation, while still mostly caring about the balance of
free space among each half of the split, and caring about a number of
other new concerns besides these two.

You're right that some parts of the patch are very complicated, but
those are all contained in the second component. That has been the
main focus of Heikki's review, by far. This second component is
concerned with picking a split point that is already known to be legal
based on the *existing* criteria. If there were bugs here, they could
not result in data corruption. The worst outcome I can conceive of is
that an index would be bloated in a new and novel way. It would be
possible to correct that in a point release without breaking on-disk
compatibility. That would be painful, certainly, but it's far from the
worst outcome.

Granted, there are also one or two subtle things in the first, more
critical component, but these are also the things that were
established earliest and have received the most testing. And, amcheck
is now capable of doing point lookups using the same code paths as
index scans (calling _bt_search()) to relocate each and every tuple on
the leaf level, starting from the root. The first component does not
change anything about how crash recovery or VACUUM works, either. It's
all about how keys compare, and how new pivot tuples are generated --
it's mostly about the key space, while changing very little about the
physical on-disk representation. (It builds on the on-disk
representation changes added in Postgres 11, for INCLUDE indexes.)

> - SortSupport implementation on inet/cdir
>
> WOA: This is a substantial amount of code submitted first for the last
> CF.
>
> Andres: punt to v13

I was kind of involved here. I think that it's fair to punt, based on
the rule about submitting a big patch to the last CF.

> - amcheck verification for GiST
>
> WOA: Some locking changes are apparently needed, possibly a bit too
> much to fix up for v12?

I had hoped that Andrey Borodin would get back to me on this soon. It
does still seem unsettled.

> - insensitive/non-deterministic collations
>
> NR: Peter has stated that he's targeting v12, but I'm not sure this
> had enough review? But it's not *that* complicated...

I could help out here.

--
Peter Geoghegan

In response to

Re: 2019-03 CF Summary / Review - Tranche #2 at 2019-02-16 05:45:26 from Andres Freund

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tom Lane	2019-02-16 07:05:08	Re: 2019-03 CF Summary / Review - Tranche #2
Previous Message	Pavel Stehule	2019-02-16 06:02:50	Re: 2019-03 CF Summary / Review - Tranche #2