Re: possible bug in handling of contrecords in dd38ff28ad (Fix recovery_prefetch with low maintenance_io_concurrency)

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: possible bug in handling of contrecords in dd38ff28ad (Fix recovery_prefetch with low maintenance_io_concurrency)
Date: 2023-07-02 18:12:42
Message-ID: 10616819-5333-e35e-32c8-c0d138542167@enterprisedb.com
Lists: pgsql-hackers

On 7/2/23 04:09, Thomas Munro wrote:
> On Sun, Jul 2, 2023 at 1:40 AM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>> I think there's some sort of bug in how dd38ff28ad deals with
>> contrecords. Consider something as simple as
>>
>> pgbench -i -s 100
>>
>> and then running pg_waldump on the WAL segments, I get this for every
>> single one:
>>
>> pg_waldump: error: error in WAL record at 0/1FFFF98: missing
>> contrecord at 0/1FFFFE0
>>
>> This only happens since dd38ff28ad, and revert makes it disappear.
>>
>> It's possible we still have some issue with contrecords, but IIUC we
>> fixed those. So unless there's some unknown one (and considering this
>> seems to happen for *every* WAL segment, that's hard to believe), this
>> seems more like an issue in the error detection.
>
> Yeah. That message is due to this part of dd38ff28ad's change:
>
> Also add an explicit error message for missing contrecords. It was a
> bit strange that we didn't report an error already, and became a latent
> bug with prefetching, since the internal state that tracks aborted
> contrecords would not survive retrying, as revealed by
> 026_overwrite_contrecord.pl with this adjustment. Reporting an error
> prevents that.
>
> We can change 'missing contrecord' back to silent end-of-decoding (as
> it was in 14) with the attached. Here [1] is some analysis of this
> error that I wrote last year. The reason for my hesitation in pushing
> a fix was something like this:
>
> 1. For pg_waldump, it's "you told me to decode only this WAL segment,
> so decoding failed here", which is useless noise
> 2. For walsender, it's "you told me to shut down, so decoding failed
> here", which is useless noise
> 3. For crash recovery, "I ran out of data, so decoding failed here",
> which seems like a report-worthy condition, I think?
>
> When I added that new error I was thinking about that third case. We
> generally expect to detect the end of WAL replay after a crash with an
> error ("invalid record length ...: wanted 24, got 0" + several other
> possibilities), but in this rare case it would be silent. The effect
> on the first two cases is cosmetic, but certainly annoying. Perhaps I
> should go ahead and back-patch the attached change, and then we can
> discuss whether/how we should do a better job of distinguishing "user
> requested artificial end of decoding" from "unexpectedly ran out of
> data" for v17?
>
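
To make the behavioural difference concrete, here is a minimal, purely
illustrative C sketch. It is not the attached patch, and none of these names
(DecodeState, read_contrecord, report_decode_error) come from xlogreader.c;
the only part taken from the discussion above is the contrast itself:
dd38ff28ad turns a short read of a continuation record into an explicit
"missing contrecord" error, while the proposed change restores the 14-era
behaviour of returning no record and no error message.

    /*
     * Illustrative sketch only -- NOT the actual xlogreader.c code or the
     * attached patch.  All names here are invented.
     */
    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct DecodeState
    {
        const char *errormsg;       /* set only for report-worthy failures */
    } DecodeState;

    /* Hypothetical stand-in for an error-reporting helper */
    static void
    report_decode_error(DecodeState *state, const char *msg)
    {
        state->errormsg = msg;
    }

    /*
     * Pretend we hit the end of the available WAL in the middle of a record
     * that continues onto the next (unavailable) page.
     */
    static bool
    read_contrecord(DecodeState *state, bool report_missing_contrecord)
    {
        if (report_missing_contrecord)
        {
            /* dd38ff28ad behaviour: decoding fails with an explicit message */
            report_decode_error(state, "missing contrecord");
            return false;
        }

        /*
         * 14-era (and proposed) behaviour: return "no record" without
         * setting errormsg, i.e. silent end of decoding.
         */
        return false;
    }

    int
    main(void)
    {
        DecodeState state = {NULL};

        if (!read_contrecord(&state, true) && state.errormsg)
            printf("error in WAL record: %s\n", state.errormsg);

        state.errormsg = NULL;
        if (!read_contrecord(&state, false) && state.errormsg == NULL)
            printf("silent end of decoding (pg_waldump simply stops here)\n");

        return EXIT_SUCCESS;
    }
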

Yeah, I think that'd be sensible. IMHO we have a habit of scaring users
with stuff that might be dangerous/bad, but 99% of the time it's
actually fine and perhaps even expected. It's almost as if we're
conditioning people to ignore errors.
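
As for the v17 question of distinguishing the two kinds of end-of-decoding,
one conceivable shape, shown purely as a sketch with invented names
(EndOfDecodingReason, handle_end_of_decoding; nothing here exists in
xlogreader.h), would be to surface an explicit reason that callers such as
pg_waldump, walsender, and crash recovery can act on:

    /*
     * Sketch only: one conceivable way to let callers distinguish "the
     * caller asked decoding to stop here" from "we unexpectedly ran out of
     * data".  These names are invented for illustration.
     */
    #include <stdio.h>

    typedef enum EndOfDecodingReason
    {
        END_NONE,                   /* a record was returned */
        END_REQUESTED_BY_CALLER,    /* pg_waldump segment bound, walsender shutdown */
        END_OUT_OF_DATA             /* ran out of WAL mid-record, e.g. missing contrecord */
    } EndOfDecodingReason;

    /* A caller such as crash recovery could decide what is report-worthy. */
    static void
    handle_end_of_decoding(EndOfDecodingReason reason)
    {
        if (reason == END_OUT_OF_DATA)
            fprintf(stderr, "reached unexpected end of WAL while decoding\n");
        /* END_REQUESTED_BY_CALLER: stay silent, this is the expected outcome */
    }

    int
    main(void)
    {
        handle_end_of_decoding(END_REQUESTED_BY_CALLER);    /* silent */
        handle_end_of_decoding(END_OUT_OF_DATA);            /* reported */
        return 0;
    }

The point is just that the decision about what is report-worthy would move to
the caller, which is the only place that knows whether the end of data was
requested or unexpected.
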

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
