From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | michael(at)paquier(dot)xyz |
Cc: | zsolt(dot)ero(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: could not link file in wal restore lines |
Date: | 2022-07-25 08:11:32 |
Message-ID: | 20220725.171132.2272594383346737093.horikyota.ntt@gmail.com |
Lists: | pgsql-bugs |
At Sat, 23 Jul 2022 12:36:47 +0900, Michael Paquier <michael(at)paquier(dot)xyz> wrote in
> FWIW, the backend code has protections to prevent *exactly* this kind
> of problems when recycling WAL segment files at checkpoints with a set
> of LWLocks taken on the control file, for one. Perhaps you have
> messed up things and you have finished in such a state that backrest
> writes to pg_wal/ concurrently with a cluster running and running a
> checkpoint, which would explain those link() calls to be failing?
That lock doesn't seem to exclude recovery.
I can reproduce this with the following script (see below) when a sleep
is added before (or after) the durable_link_or_rename call in
InstallXlogFileSegment (patch attached). Some adjustment might be required
to reproduce the same behavior in other environments.
=====
2022-07-25 17:05:57.730 JST [151758] LOG: restored log file "000000010000000000000057" from archive
2022-07-25 17:05:57.760 JST [151758] LOG: restored log file "000000010000000000000058" from archive
2022-07-25 17:05:57.782 JST [151758] LOG: restored log file "000000010000000000000059" from archive
2022-07-25 17:05:57.790 JST [151762] LOG: could not link file "pg_wal/000000010000000000000002" to "pg_wal/000000010000000000000059": File exists
2022-07-25 17:05:57.802 JST [151758] LOG: restored log file "00000001000000000000005A" from archive
2022-07-25 17:05:58.294 JST [151762] LOG: could not link file "pg_wal/000000010000000000000003" to "pg_wal/00000001000000000000005A": File exists
========
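The "File exists" errors above are what link(2) reports when the target
name is already taken. A minimal shell sketch of that failure mode follows.
It uses a throwaway directory and made-up file contents, so nothing here
touches a real pg_wal; the two file names are only borrowed from the log
for readability.

```shell
#!/bin/bash
# Illustration only: a scratch directory stands in for pg_wal.
dir=$(mktemp -d)

# An old segment that a checkpoint would like to recycle under a future name.
echo old > "$dir/000000010000000000000002"

# Recovery has already restored that future segment from the archive.
echo restored > "$dir/000000010000000000000059"

# A hard link never replaces an existing name, so this attempt fails
# and the restored file is left as it was.
if ln "$dir/000000010000000000000002" "$dir/000000010000000000000059" 2>"$dir/err"; then
    result=linked
else
    result=failed
fi
content=$(cat "$dir/000000010000000000000059")

echo "link attempt: $result"
cat "$dir/err"
rm -rf "$dir"
```

Whichever side creates the name first wins; the other one only logs the
failure, which matches the LOG-level messages in the excerpt.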
#! /bin/bash
# create a backup-source
PGDATA=~/test/data
PGARC=~/test/arc
BKDIR=~/test/bk
CPDATA=~/test/dt
rm -f /tmp/hoge
rm -rf $PGDATA $PGARC $BKDIR $CPDATA
mkdir -p $PGARC
killall -9 postgres
initdb -D $PGDATA
echo "archive_mode=on" >> $PGDATA/postgresql.conf
echo "archive_command = 'cp %p $PGARC/%f'" >> $PGDATA/postgresql.conf
# start the source
pg_ctl -D $PGDATA start
# take a backup
pg_basebackup -D $BKDIR
echo "archive_mode=off" >> $BKDIR/postgresql.conf
echo "restore_command='cp $PGARC/%f %p'" >> $BKDIR/postgresql.conf
touch $BKDIR/recovery.signal
# create archived segments
psql -c 'create table t (a int)'
for i in $(seq 1 100); do psql -c 'insert into t values(0); select pg_switch_wal()'; done
# stop the source
pg_ctl -D $PGDATA stop
# start recovery
rm -rf $CPDATA
cp -r $BKDIR $CPDATA
# flag file; presumably what enables the sleep added by the attached patch
touch /tmp/hoge
postgres -D $CPDATA 2>&1 | tee recovery.log
======
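To check whether a run reproduced the problem, the recovery.log written by
tee in the script can be summarized with a small helper along these lines.
The function name is hypothetical, and the two grep patterns are simply the
message texts seen in the log excerpt earlier in this mail.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original reproduction script.
# Counts restored segments and failed link attempts in a server log.
summarize_recovery_log() {
    local log=$1
    local restored failed
    # grep -c prints 0 but exits non-zero when nothing matches; ignore that.
    restored=$(grep -c 'restored log file' "$log" || true)
    failed=$(grep -c 'could not link file' "$log" || true)
    echo "restored=$restored link_failures=$failed"
}

# Only summarize when the log from the reproduction run is present.
if [ -f recovery.log ]; then
    summarize_recovery_log recovery.log
fi
```

A non-zero link_failures count means the race was hit; if it stays at zero,
the length of the sleep probably needs adjusting for the environment.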
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
repro20220725.diff | text/x-patch | 577 bytes |