Re: XLogReadRecord() error in XlogReadTwoPhaseData()

From: Noah Misch <noah(at)leadboat(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: pgbf(at)twiska(dot)com, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: XLogReadRecord() error in XlogReadTwoPhaseData()
Date: 2022-01-22 18:52:41
Message-ID: 20220122185241.GA1095169@rfd.leadboat.com
Lists: pgsql-hackers

On Sun, Jan 16, 2022 at 01:02:41PM -0800, Noah Misch wrote:
> My next steps:
>
> - Report a Debian bug for the sparc64+ext4 zeros problem.

(Not done yet.)

> - Try to falsify the idea that "write only the not-already-written portion of
> a WAL block" is an effective workaround. Specifically, modify the test
> program to have the writer process mutate offsets [N-k,N-1] and [N+1,N+k]
> while the reader process reads offset N. If the reader sees a zero, that
> workaround is ineffective.

The reader did not see a zero. In addition to bytes outside the write being
immune to the zeros bug, the first and last forty bytes of a write were immune
to the zeros bug.
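
For readers who want to try something similar, here is a minimal sketch of the kind of test described above. It is not the actual test program: the function name reader_saw_zero, its parameters, and the use of fork() with pread()/pwrite() are assumptions made for illustration. The writer process keeps rewriting offsets [N-k,N-1] and [N+1,N+k] with nonzero bytes, and the reader checks whether offset N ever reads back as zero.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical harness, not the original test program.
 *
 * The open file "fd" must already hold nonzero bytes in [n - k, n + k].
 * A child process rewrites [n - k, n - 1] and [n + 1, n + k] in a loop while
 * the parent reads the single byte at offset n.
 *
 * Returns 1 if the reader saw a zero at offset n, 0 if it never did, and
 * -1 if fork() or pread() failed.
 */
static int
reader_saw_zero(int fd, off_t n, size_t k, long reads)
{
    pid_t writer;
    int   result = 0;

    assert(k > 0 && n >= (off_t) k);

    writer = fork();
    if (writer < 0)
        return -1;

    if (writer == 0)
    {
        /* Writer: mutate both neighbors of n, never n itself. */
        char         *buf = malloc(k);
        unsigned char fill = 1;

        if (buf == NULL)
            _exit(1);
        for (;;)
        {
            memset(buf, fill, k);
            if (pwrite(fd, buf, k, n - (off_t) k) < 0 ||
                pwrite(fd, buf, k, n + 1) < 0)
                _exit(1);
            fill = (fill == 255) ? 1 : fill + 1;    /* never write a zero */
        }
    }

    /* Reader: offset n is outside every write, so it should stay nonzero. */
    for (long i = 0; i < reads; i++)
    {
        char    c;
        ssize_t r = pread(fd, &c, 1, n);

        if (r != 1)
        {
            result = -1;
            break;
        }
        if (c == 0)
        {
            result = 1;
            break;
        }
    }

    kill(writer, SIGKILL);
    waitpid(writer, NULL, 0);
    return result;
}
```

On a filesystem without the zeros problem this should return 0 for any number of reads. A return value of 1 would mean that bytes outside a concurrent write can read as zero, which would falsify the workaround.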

> - Implement the workaround, if I didn't falsify its effectiveness. If it
> doesn't hurt performance on x86_64, we can use it unconditionally.
> Otherwise, limit its use to sparc64 Linux.

Attached. With this, kittiwake has survived 8.5 hours of 003_cic_2pc.pl. Without
the patch, it failed many times, always within 1.3 hours. For easier review, this
patch uses the new behavior on all platforms. Before commit and back-patch, I
plan to limit use of the new behavior to sparc Linux. Future work can
benchmark the new behavior and, if it performs well, make it unconditional in
v15+. I would expect performance to be unchanged or slightly better, because
the new behavior requests less futile work from the OS.
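
As a rough illustration of "write only the not-already-written portion of a WAL block", the sketch below tracks how many bytes of a block an earlier call already wrote and issues pwrite() only for the bytes after that point. It is not taken from the attached patch: the helper name write_unwritten_tail and its parameters are hypothetical, and the real change lives in PostgreSQL's WAL-writing code rather than in a standalone function.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not code from the attached patch.
 *
 * "page" holds one WAL block stored at file offset page_off.  Bytes
 * [0, already_written) were written by an earlier call; bytes
 * [already_written, valid_len) are new.  Only the new bytes are written, so
 * bytes a concurrent reader may be reading are never rewritten.
 *
 * Returns the number of bytes written, or -1 on error with errno set.
 */
static ssize_t
write_unwritten_tail(int fd, const char *page, off_t page_off,
                     size_t already_written, size_t valid_len)
{
    size_t done = already_written;

    assert(already_written <= valid_len);

    while (done < valid_len)
    {
        ssize_t n = pwrite(fd, page + done, valid_len - done,
                           page_off + (off_t) done);

        if (n < 0)
        {
            if (errno == EINTR)
                continue;       /* interrupted, retry the same range */
            return -1;
        }
        if (n == 0)
        {
            errno = ENOSPC;     /* no progress; treat as out of space */
            return -1;
        }
        done += (size_t) n;
    }
    return (ssize_t) (done - already_written);
}
```

A caller would remember valid_len from one call and pass it as already_written on the next call for the same block. When nothing new has been added, the function writes nothing.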

Attachment: xlog-write-once-v1.patch (text/plain, 4.4 KB)
