Re: FSM corruption leading to errors

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FSM corruption leading to errors
Date: 2016-10-18 04:01:12
Message-ID: CABOikdOqLzbPNaxgcJL0-jfNDpb7xLAFJ4ZyHKysDYOwMUYnfQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Oct 17, 2016 at 4:34 PM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:

>
>>
> visibilitymap_truncate is actually also wrong, in a different way. The
> truncation WAL record is written only after the VM (and FSM) are truncated.
> But visibilitymap_truncate() has already modified and dirtied the page. If
> the VM page change is flushed to disk before the WAL record, and you crash,
> you might have a torn VM page and a checksum failure.
>
>
Right, I missed the problem with checksums.

> Simply replacing the MarkBufferDirtyHint() call with MarkBufferDirty() in
> FreeSpaceMapTruncateRel would have the same issue. If you call
> MarkBufferDirty(), you must WAL-log the change, and also set the page's LSN
> to make sure the WAL record is flushed first.
>
>
I'm not sure I fully understand this. If the page is flushed before the WAL
record, we might have a truncated FSM where as the relation truncation is
not replayed. But that's not the same problem, right? I mean, you might
have an FSM which is not accurate, but it won't at the least return the
blocks outside the range. Having said that, I agree your proposed changes
are more robust.

BTW any thoughts on race-condition on the primary? Comments at
MarkBufferDirtyHint() seems to suggest that a race condition is possible
which might leave the buffer without the DIRTY flag, but I'm not sure if
that can only happen when the page is locked in shared mode. While most of
the reports so far involved standbys, and the bug can also hit a standalone
master during crash recovery, I wonder if a race can occur even on a live
system.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2016-10-18 04:01:55 Re: Add PGDLLEXPORT to PG_FUNCTION_INFO_V1
Previous Message Craig Ringer 2016-10-18 03:59:39 Re: Idempotency for all DDL statements