Re: 64.4.2. Bottom-up Index Deletion

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: hus(dot)mhd(at)gmail(dot)com, pgsql-docs(at)lists(dot)postgresql(dot)org
Subject: Re: 64.4.2. Bottom-up Index Deletion
Date: 2022-11-09 22:20:16
Message-ID: CAKFQuwZ7=V8yXQB00KRNgDfQ=_8WMcw9WFmqYnhETfWENzC-6A@mail.gmail.com
Lists: pgsql-docs

On Mon, Nov 7, 2022 at 5:20 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

> Hi Hussein,
>
> Apologies for the very delayed response. I'm aware that you've taken
> an interest in this subject as part of your YouTube channel. Thanks
> for publicizing the work!
>
> On Tue, Jul 12, 2022 at 7:14 PM PG Doc comments form
> <noreply(at)postgresql(dot)org> wrote:
> > Would be nice to add a note: old tuple versions in the index referencing
> > the same logical row cannot be deleted by bottom up index deletion process
> > when older transactions that might require the old state the row are still
> > running
>
> It's really hard to write documentation for something like this,
> because it's difficult to decide what your audience really needs to
> know. I agree that it's important to get this specific point across,
> though. In fact I thought that I already conveyed the same idea at
> this point:
>
> "All indexes will need a successor physical index tuple that points to
> the latest version in the table. Each new tuple within each index will
> generally need to coexist with the original “updated” tuple for a
> short period of time (typically until shortly after the UPDATE
> transaction commits)."
>
> The implication is that we need the old version to coexist until after
> the updater transaction commits and is seen by every possible MVCC
> snapshot as having committed -- nobody sees the old version anymore.
> Maybe we could augment the existing sentences I have highlighted?
> Could it be more explicit?
>

I'm having trouble finding any major issues with the present wording,
though it does seem to assume the reader has enough MVCC knowledge to
understand the import of "until shortly after the UPDATE transaction
commits". A bit more explicitness may be in order.

On the point of "will generally need to coexist" - I don't see why we are
being wishy-washy here, though.

When updating a row where bottom-up deletion is chosen, the most recent
tuple cannot be removed to make room for the new tuple, in particular
because the current update may not commit.

I also don't understand how the bottom-up pass can know a tuple is safe
to remove based on visibility information when that information is not
present in the index AND the pass doesn't rely on LP_DEAD.

A bit nit-picky, but I think relevant to the above confusion:

"B-Tree indexes incrementally delete" - is it really the index
self-modifying, or is it an active user session taking some time to perform
each pass? I would describe it as, say:

"The updating session will locate all the logically equivalent tuples (on
the same page) via the index and check them for global visibility, removing
those that it finds that are both older than the most recent tuple and no
longer visible to any other session."
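
To make the proposed wording concrete, here is a minimal toy sketch in
Python of the check it describes. It is not PostgreSQL's implementation.
Every name in it (HeapTuple, IndexTuple, removable, bottom_up_pass,
oldest_snapshot_xmin) is hypothetical, and the visibility rule is
deliberately simplified: a version counts as removable only when the
transaction that superseded it has committed and is older than every
snapshot still in use. The index entries carry only a pointer to the
table row version, so the pass has to consult the table to decide.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class HeapTuple:
    xmin: int             # transaction that created this row version
    xmax: Optional[int]   # transaction that superseded it, None if current


@dataclass
class IndexTuple:
    key: int
    tid: int              # pointer into the heap; no visibility data here


def removable(version: HeapTuple, committed: Set[int],
              oldest_snapshot_xmin: int) -> bool:
    """Simplified rule: dead to every snapshot, present and future."""
    xmax = version.xmax
    if xmax is None:
        return False      # most recent version: always kept
    if xmax not in committed:
        return False      # the superseding update may still abort
    return xmax < oldest_snapshot_xmin


def bottom_up_pass(page: List[IndexTuple], heap: Dict[int, HeapTuple],
                   committed: Set[int],
                   oldest_snapshot_xmin: int) -> List[IndexTuple]:
    """Return the index tuples that survive one simulated pass."""
    return [it for it in page
            if not removable(heap[it.tid], committed, oldest_snapshot_xmin)]


# Three versions of one logical row, all with the same index key.
heap = {1: HeapTuple(10, 20), 2: HeapTuple(20, 30), 3: HeapTuple(30, None)}
page = [IndexTuple(5, 1), IndexTuple(5, 2), IndexTuple(5, 3)]

# Transaction 30 is still in progress, so version 2 must be kept.
survivors = bottom_up_pass(page, heap, committed={10, 20},
                           oldest_snapshot_xmin=40)
print([it.tid for it in survivors])
```

In this model an old snapshot (a low oldest_snapshot_xmin) keeps every
version alive, which is the point the original doc comment wanted stated
explicitly.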

David J.
