Re: 64.4.2. Bottom-up Index Deletion

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: hus(dot)mhd(at)gmail(dot)com, pgsql-docs(at)lists(dot)postgresql(dot)org
Subject: Re: 64.4.2. Bottom-up Index Deletion
Date: 2022-11-10 00:06:15
Message-ID: CAKFQuwZrvHrCUEk4SNKKVoX5RDuv-rVXn1sf=2ChC6ug9QAVoQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs

On Wed, Nov 9, 2022 at 4:29 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

> On Wed, Nov 9, 2022 at 2:20 PM David G. Johnston
> <david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
> > Maybe a bit more explicitness is in order.
>
> Yeah, maybe.
>
> > On the point of "will generally need to coexist" - I don't see why we
> are being wishy-washy here, though.
>
> Because sometimes it will take a relatively long time (say in the
> presence of a long running xact/snapshot). With an aborted transaction
> there will be no wait for the tuple to become eligible to
> remove/delete at all (even in the presence of a long running
> xact/shapshot).
>

Ok, so the "most recent live tuple" - this in an update, there has to be
one. Can that live tuple that is being updated ever be removed by this
process?

I'm not sure what the long-running transaction has to do with this though -
we are in an update so we've locked an active tuple and that is the only
tuple this coexistence is talking about. The other sentence tries to cover
the remaining ones with a general "best effort" hand-wave which is OK.

>
> > When updating a row where bottom-up deletion is chosen the most recent
> tuple cannot be removed to make room for the new tuple; in particular,
> because the current update may not commit.
>
> Sort of. Could also be multiple still-needed tuples.
>

That there are others doesn't support or invalidate the sentence though.
"most recent live tuple" - though (i.e., the one being updated).

>
> In fact it's possible that it'll be a tuple that became garbage as a
> result of a DELETE statement run during a committed transaction -- we
> need only visit the same heap page during the processing that takes
> place on the heapam side.
>

You lost me here. Which tuple is "it" - the one we are updating got
deleted under us?

> > I'm also not inherently understanding how the bottom-up pass can know a
> tuple is safe to remove based upon visibility information when that
> information is not present in the index AND it doesn't rely upon LP_DEAD.
>
> It goes to the heap to get it.
>

ok

>
> > "B-Tree indexes incrementally delete" - is it really the index
> self-modifying or is it an active user session taking some time to perform
> each pass? Describing it as, say:
> >
> > "The updating session will locate all the logically equivalent tuples
> (on the same page) via the index and check them for global visibility,
> removing those that it finds that are both older than the most recent tuple
> and no longer visible to all other sessions."
>
> But it actually just deletes whatever tuples are determined to be safe
> to delete. The B-Tree side of this has some very vague idea about how
> updates and MVCC version churn works, but it is quite unsure of
> itself.
>

Ah, OK, I somehow got into my head that only duplicates of the row being
updated mattered (I think the "deduplication process" drove me to this, and
your comment "The nice thing about bottom-up index deletion is that it
reliably places the burden of performing deletion on the specific
non-HOT updaters
that are responsible for bloating the index.")

So, the page that the update has to be placed on gets picked for cleanup
but the actual fetching of rows and such doesn't actually care what that
tuple is. Though its ancestors are likely to be picked.

> I don't think that these subtleties need to be documented. But it's
> difficult to know where to draw the line.
>
>
Knowing for certain whether the cleanup scope is "just prior versions of my
update" or "any old versions on the page my update just happened to be
placed on" seems user-facing worthy.

I'm almost thinking the last sentence "Each new tuple within each
index....for a short period of time..." is doing more harm than good.
Removing it makes things clearer.

"...latest version in the table".

(next paragraph)

"The no-longer-visible versions of these index entries are removed by three
separate processes. Vacuum (see chapter), simple index tuple deletion (see
note), and bottom-up deletion (covered in the rest of this section).
Bottom-up index deletion is an incremental version churn deletion process,
triggered by an anticipated "version churn page split". This only
happens.....(the rest can probably remain the same, though I suggest moving
the note to the end of the section)."

Adding "no-longer-visible" at the start should hopefully be a sufficient
reminder for someone familiar with MVCC; or a clue that more reading of
pre-requisite features is needed before understanding this internal part of
the system.

David J.

In response to

Responses

Browse pgsql-docs by date

  From Date Subject
Next Message Peter Geoghegan 2022-11-10 00:34:04 Re: 64.4.2. Bottom-up Index Deletion
Previous Message Peter Geoghegan 2022-11-09 23:28:56 Re: 64.4.2. Bottom-up Index Deletion