From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org, dilipbalaut(at)gmail(dot)com |
Cc: | robertmhaas(at)gmail(dot)com |
Subject: | STRATEGY=WAL_LOG missing checkpoint interlocks and sync |
Date: | 2024-01-30 19:50:03 |
Message-ID: | 20240130195003.0a.nmisch@google.com |
Lists: | pgsql-bugs |

(1) CreateDirAndVersionFile() uses a code sequence: XLogInsert(), XLogFlush(),
write(PG_VERSION). This is missing the interlock with checkpoints, which
pd_lsn usually secures. This breaks when one takes a base backup between the
XLogInsert() and the write(). The transam/README "action first and then write
a WAL entry" strategy works well. createdb_failure_callback() will remove the
whole directory on failure.
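
The ordering argument in (1) can be sketched as a toy model in C. This is not PostgreSQL code: `Ordering`, `restored_has_file`, and the three backup points are invented for illustration. The model assumes the backup copies the directory at the instant its checkpoint fixes the redo point, that the backup ends after both steps (so the WAL record is always available to recovery), and that replaying the record recreates the file.

```c
#include <stdbool.h>

/* Toy model only; none of these names exist in PostgreSQL. */
typedef enum
{
    WAL_FIRST,                  /* XLogInsert(), then write(PG_VERSION) */
    ACTION_FIRST                /* write(PG_VERSION), then XLogInsert() */
} Ordering;

/*
 * Does a restored backup end up with the file?
 *
 * backup_point says where the backup's checkpoint falls:
 *   0 = before both steps, 1 = between the two steps, 2 = after both.
 */
bool
restored_has_file(Ordering ord, int backup_point)
{
    /* Position (0 or 1) of each step within the two-step sequence. */
    int         wal_step = (ord == WAL_FIRST) ? 0 : 1;
    int         file_step = (ord == WAL_FIRST) ? 1 : 0;

    /* The backup copies the file only if it was written before the copy. */
    bool        copied = file_step < backup_point;

    /* Recovery replays the record only if it sits at or after redo. */
    bool        replayed = wal_step >= backup_point;

    return copied || replayed;
}
```

Under these assumptions, the only combination that returns false is `restored_has_file(WAL_FIRST, 1)`, the window between the XLogInsert() and the write(). With `ACTION_FIRST`, every backup point either copies the file or replays the record.
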
(2) RelationMapCopy() reasons that it doesn't need RelationMappingLock.
CheckPointRelationMap() relies on RelationMappingLock to avoid completing a
checkpoint between the WAL and the sync. A base backup taken in the middle of
write_relmap_file() has no relmap after recovery.

(3) CreateDirAndVersionFile() populates the PG_VERSION file without syncing
it. An OS crash after the next checkpoint may leave PG_VERSION empty or
missing. Let's fsync the file immediately. I used LazyFS to confirm the
defect and fix. Though I was skeptical about adding a wait event in back
branches, I plan to do so anyway. Commit 368ffde got away with it, and we do
have this pattern of the sync event being separate from the write event.
I'm also attaching a test for (1) and (2), but I plan not to commit it. If
you want to see the bugs in action, you might find it informative. I
abandoned it when this complication felt like it would take a while to resolve:

+ # FIXME While this is an effective test of XLOG_DBASE_CREATE_WAL_LOG, the
+ # XLOG_RELMAP_UPDATE makes the backup's checkpoint hang waiting for
+ # RelationMappingLock. To have a test that both fails with the bug and
+ # passes with its fix, we'd need a procedure like this:
+ #
+ # while (backup client still not done)
+ # {
+ #   run backup client until it's waiting on a lock CREATE DATABASE holds;
+ #   unpause CREATE DATABASE and re-pause it after its next lock release;
+ # }

As a generalization of that test, it would be great to have one that tries a
base backup after every XLogInsert() of a CREATE DATABASE or even every
XLogInsert() of a src/test/regress run. Reintroducing (1) or (2) isn't too
likely, but some part of the system could have or acquire similar bugs.

Thanks,
nm

Attachment | Content-Type | Size |
---|---|---|
WAL_LOG-1-chkpt-interleave-v1.patch | text/plain | 4.9 KB |
WAL_LOG-2-PG_VERSION-sync-v1.patch | text/plain | 2.2 KB |
test-WAL_LOG-chkpt-interleave-v0.patch | text/plain | 14.7 KB |