Re: Exceptional md.c paths for recovery and zero_damaged_pages

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)postgresql(dot)org, Noah Misch <noah(at)leadboat(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Exceptional md.c paths for recovery and zero_damaged_pages
Date: 2024-12-17 21:46:40
Message-ID: arzfrsobgv5ljw45vrj7zxb2ww2lurk7lgtquz6q2gwdq6fqse@mzjhjkir5lof
Lists: pgsql-hackers

Hi,

On 2024-12-17 16:11:53 -0500, Peter Geoghegan wrote:
> On Tue, Dec 17, 2024 at 12:57 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> > Hmm, looking at index_fetch_heap(), I'm surprised it doesn't throw an
> > error or even a warning if the heap tuple isn't found. That would seem
> > like a useful sanity check. An index tuple should never point to a
> > non-existent heap TID I believe.
>
> I think that it is necessary. We need to be prepared to perform a TID
> lookup with a wild page offset number to account for concurrent TID
> recycling. At least with nbtree plain index scans.

ISTM that we could do better with some fairly simple cooperation between index
and table AM. It should be rather rare to look up a TID that was removed
between finding the index entry and fetching the table entry, so a small amount
of extra work in that case ought to be ok.

Could we e.g., for logged tables, track the LSN of the leaf index page in the
IndexScanDesc, pass that to table_index_fetch_tuple(), and only error out
if the table relation has a newer LSN? That leaves a small window for a
false-negative, but shouldn't have any false-positives?
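
To make the shape of such a check concrete, here is a minimal standalone
sketch. It is not PostgreSQL code: the XLogRecPtr typedef is a stand-in, and
FetchVerdict and classify_fetch() are hypothetical names. The direction of the
comparison is this sketch's assumption, not something the proposal above pins
down: a miss is treated as explainable by concurrent TID recycling when the
heap page LSN is newer than the leaf page LSN remembered by the scan, and as
suspicious otherwise.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's XLogRecPtr (a 64-bit WAL position). */
typedef uint64_t XLogRecPtr;

/* Hypothetical outcome of a heap fetch driven by an index entry. */
typedef enum
{
    FETCH_FOUND,            /* tuple was found, nothing to check */
    FETCH_MISSING_OK,       /* miss may be due to concurrent TID recycling */
    FETCH_MISSING_SUSPECT   /* miss has no such explanation: raise an error */
} FetchVerdict;

/*
 * Classify a TID lookup.
 *
 * index_leaf_lsn: LSN of the leaf index page, remembered in the scan
 *                 descriptor when the index entry was read.
 * heap_page_lsn:  LSN of the heap page the TID points into.
 *
 * ASSUMPTION: a heap page LSN newer than the remembered leaf LSN means the
 * heap page changed after the index entry was read, so the TID may have
 * been removed legitimately. An unrelated change to the heap page takes
 * the same branch, which is where a false negative can come from.
 */
static FetchVerdict
classify_fetch(bool tuple_found, XLogRecPtr index_leaf_lsn,
               XLogRecPtr heap_page_lsn)
{
    if (tuple_found)
        return FETCH_FOUND;

    if (heap_page_lsn > index_leaf_lsn)
        return FETCH_MISSING_OK;

    return FETCH_MISSING_SUSPECT;
}
```

A caller would report an error only for FETCH_MISSING_SUSPECT. Unlogged
tables would need to skip the check, since the proposal is limited to logged
tables.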

I've seen enough bugs / corruption leading to indexes pointing to wrong and
nonexisting tuples to make me think it's worth being a bit more proactive
about raising errors for such cases. Of course what I described above can
only detect index entries pointing to non-existing tuples, but that's still
a good bit better than nothing.

Greetings,

Andres Freund
