Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Pawel Kudzia <kudzia(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows
Date: 2021-07-18 01:24:42
Message-ID: CAH2-WzkkEdEQJrje3EpWZn26m605Odcv+eGL=f8bKcTxKUo-dQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Sat, Jul 17, 2021 at 4:25 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > And a
> > gin btree page won't be reused prematurely as a pending list page,
> > because ginNewBuffer() checks GinPageIsRecyclable(), even if the page is
> > being reused as a pending list page.
>
> It doesn't matter that ginNewBuffer() checks GinPageIsRecyclable(),
> since the page header's pd_prune_xid field will be 0 for a pending
> list page (same as other index AMs, though not the same as other GIN
> code). Note that GinPageIsRecyclable() returns "true" (safe to
> recycle) when it sees any "pd_prune_xid = 0" page, by design. Also, if
> ginInsertCleanup() ever tries to process/delete a posting tree or
> entry tree page, then it's still not going to set pd_prune_xid in the
> first place (because it believes that they're pending list pages).

To be clear, I'm not worried about premature reuse of some other page
as a pending list page -- that should be fine. I'm worried about
unsafe early reuse of pending list pages, where the new page image is
related to either a posting tree or the entry tree. I think that it's
possible that the whole locking protocol can't handle premature
recycling, because it wasn't really built with that in mind.
shiftList() will call RecordFreeIndexPage() for pages it deletes as of
commit e956808328 from 2015. I find the code hard to follow, but I
suspect that some place cannot handle it if a
marked-GIN_DELETED-by-shiftList() page gets concurrently recycled.
Distant code may need to still have the same right link (and even the
same page contents?) to handle everything.

For example, distant code like scanGetCandidate() from ginget.c
participates in buffer lock coupling of pending list pages, and so has
interactions with ginInsertCleanup() and its own lock coupling. But
scanGetCandidate() hasn't changed since 2009, and might well have been
missed by the numerous bug fixes that changed scanGetCandidate() code
following commit e956808328 from 2015. Nearby code in
scanPendingInsert() also hasn't been changed since 2009. But
ginInsertCleanup() *has* changed since that time, by quite a bit.
Starting with making shiftList() call RecordFreeIndexPage() for pages
it deletes. But other stuff too.

I might be mistaken in certain details, but the fact that it's so hard
to keep all the details straight concerns me.

--
Peter Geoghegan

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Thomas Munro 2021-07-18 05:13:50 Re: BUG #16696: Backend crash in llvmjit
Previous Message Peter Geoghegan 2021-07-17 23:25:54 Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows