From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
Cc: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, tgl(at)sss(dot)pgh(dot)pa(dot)us, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: avoid multiple hard links to same WAL file after a crash |
Date: | 2022-05-05 11:10:02 |
Message-ID: | YnOwioN/mo6k1Dtb@paquier.xyz |
Lists: | pgsql-hackers |
On Mon, May 02, 2022 at 04:06:13PM -0700, Nathan Bossart wrote:
> Here is a new patch set. For now, I've only removed the file existence
> check in writeTimeLineHistoryFile(). I don't know if I'm totally convinced
> that there isn't a problem here (e.g., due to concurrent .ready file
> creation), but since some platforms have been using rename() for some time,
> I don't know how worried we should be.
That's only about Windows these days, meaning that there is much less
coverage in this code path.
> I thought about adding some kind of
> locking between the WAL receiver and startup processes, but that seems
> excessive.
Agreed.
> Alternatively, we could just fix xlog.c as proposed earlier
> [0]. AFAICT that is the only caller that can experience problems due to
> the multiple-hard-link issue. All other callers are simply renaming a
> temporary file into place, and the temporary file can be discarded if left
> behind after a crash.
I'd agree with removing all the callers at the end. pgrename() is
quite robust on Windows, but I'd keep the two checks in
writeTimeLineHistory(), as the logic around findNewestTimeLine() would
consider a past TLI history file as in-use even if we have a crash
just after the file got created in the same path by the same standby,
and the WAL segment init part. Your patch does that.
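
For readers following along, the crash window behind the multiple-hard-link issue can be sketched as below. This is a minimal illustration, not PostgreSQL code: it assumes a POSIX system and an exclusive rename built from link() followed by unlink(), and the helper names (create_empty_file, link_count, excl_rename_link_step, excl_rename_unlink_step) are made up for this example.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create an empty file; returns 0 on success. */
int
create_empty_file(const char *path)
{
    FILE *f = fopen(path, "w");

    if (f == NULL)
        return -1;
    return fclose(f);
}

/* Hypothetical helper: number of hard links to path, or -1 if stat() fails. */
long
link_count(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (long) st.st_nlink;
}

/*
 * First half of a link()+unlink() style rename.  link() fails with EEXIST
 * if newpath already exists, which is what makes the rename "exclusive".
 * After it succeeds, oldpath and newpath are two names for the same file.
 */
int
excl_rename_link_step(const char *oldpath, const char *newpath)
{
    return link(oldpath, newpath);
}

/*
 * Second half: drop the old name.  A crash between the two steps leaves
 * both names on disk, pointing at the same file.
 */
int
excl_rename_unlink_step(const char *oldpath)
{
    return unlink(oldpath);
}
```

Between the two steps, link_count() reports 2 for either name; that is the state a crash would leave behind. A plain rename() has no such intermediate state, but it silently replaces an existing target, which is why the question of which existence checks to keep comes up.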
--
Michael