From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: XLogReadRecord() error in XlogReadTwoPhaseData() |
Date: | 2021-11-18 07:05:06 |
Message-ID: | 20211118070506.GA1294133@rfd.leadboat.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Nov 17, 2021 at 05:47:10PM -0500, Tom Lane wrote:
> Noah Misch <noah(at)leadboat(dot)com> writes:
> > Each of the three failures happened on a sparc64 Debian+gcc machine. I had
> > tried ~8000 iterations on thorntail, another sparc64 Debian+gcc animal,
> > without reproducing this.
> # 'pgbench: error: client 0 script 1 aborted in command 4 query 0: ERROR: could not read two-phase state from WAL at 0/159EF88: unexpected pageaddr 0/0 in log segment 000000010000000000000001, offset 5890048
> [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tadarida&dt=2021-11-17%2013%3A01%3A24
Two others:
ERROR: could not read two-phase state from WAL at 0/16F1850: invalid record length at 0/16F1850: wanted 24, got 0
-- https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tadarida&dt=2021-11-12%2013%3A01%3A15
ERROR: could not read two-phase state from WAL at 0/1668020: incorrect resource manager data checksum in record at 0/1668020
-- https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kittiwake&dt=2021-11-16%2015%3A00%3A52
> I suppose "unexpected pageaddr 0/0" is most easily explained by supposing
> that XlogReadTwoPhaseData tried to read a WAL page that hadn't been
> written out yet. Have we got any synchronization around that?
If the WAL address isn't on disk, that error doesn't happen. Instead,
read_local_xlog_page() blocks waiting for the WAL to become available. It's
still possible that we make the WAL region exist, but it somehow doesn't
contain the right data until shortly later. FinishPreparedTransaction() takes
TwoPhaseStateLock and looks for an entry having gxact->valid. EndPrepare()
fills gxact->prepare_end_lsn, then calls MarkAsPrepared() to set gxact->valid
under TwoPhaseStateLock. All freelist (freeGXacts) interaction holds
TwoPhaseStateLock. I'm not seeing a gap in that synchronization.
I don't have a great theory, but here are candidates to examine next:
- Run with wal_debug=on to confirm logged write location matches read location.
- Run "PGDATA=contrib/amcheck/tmp_check/t_003_cic_2pc_CIC_2PC_test_data/pgdata
pg_waldump -s 0/01000000" at the end of the test.
- Dump WAL page binary image at the point of failure.
- Log which branches in XLogReadRecord() are taken.
What else might help?
From | Date | Subject | |
---|---|---|---|
Next Message | osumi.takamichi@fujitsu.com | 2021-11-18 07:22:15 | RE: Optionally automatically disable logical replication subscriptions on error |
Previous Message | Michael Paquier | 2021-11-18 06:58:18 | Re: pg_upgrade test for binary compatibility of core data types |