Re: Use durable_unlink for .ready and .done files for WAL segment removal

From: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Use durable_unlink for .ready and .done files for WAL segment removal
Date: 2018-11-27 21:49:29
Message-ID: DE3D7AD9-45CC-4933-B409-CE6066595918@amazon.com
Lists: pgsql-hackers

On 11/27/18, 3:20 PM, "Michael Paquier" <michael(at)paquier(dot)xyz> wrote:
> On Tue, Nov 27, 2018 at 08:43:06PM +0000, Bossart, Nathan wrote:
>> IIUC any time that the file does not exist, we will attempt to unlink
>> it. Regardless of whether unlinking fails or succeeds, we then
>> proceed to give up archiving for now, but it's not clear why. Perhaps
>> we should retry unlinking a number of times (like we do for
>> pgarch_archiveXlog()) when durable_unlink() fails and simply "break"
>> to move on to the next .ready file if durable_unlink() succeeds.
>
> Both suggestions sound reasonable to me. (durable_unlink is not called
> on HEAD in pgarch_archiveXlog). How about 3 retries with an in-between
> wait of 1s? That's consistent with what pgarch_ArchiverCopyLoop does,
> still I am not completely sure if we actually want to be consistent for
> the purpose of removing orphaned ready files.

That sounds good to me. I was actually thinking of using the same
retry counter that we use for pgarch_archiveXlog(), but on second
thought, it is probably better to have two independent retry counters
for these two unrelated operations.
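
To illustrate the two-counter idea, here is a rough standalone sketch. It is a
simplified model and not the actual pgarch.c code or the patch under
discussion: the ready_file struct, try_op(), copy_loop(), and the name
NUM_ORPHAN_CLEANUP_RETRIES are all assumptions made up for this example, and
the 1s wait between attempts is only noted in a comment. Each .ready file gets
its own pair of counters, so failed unlink attempts never eat into the archive
retry budget and vice versa.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_ARCHIVE_RETRIES        3   /* retry limit for archiving a segment */
#define NUM_ORPHAN_CLEANUP_RETRIES 3   /* hypothetical: limit for unlink retries */

/* Hypothetical model of one .ready file found in archive_status. */
typedef struct
{
    const char *name;
    bool        wal_exists;     /* is the WAL segment itself still present? */
    int         failures_left;  /* simulated failures before the operation succeeds */
} ready_file;

/* Stand-in for either the archive command or durable_unlink(). */
static bool
try_op(ready_file *f)
{
    if (f->failures_left > 0)
    {
        f->failures_left--;
        return false;
    }
    return true;
}

/*
 * Walk the .ready files in order.  Returns how many were handled (archived
 * or removed as orphans) before the loop finished or gave up for now.
 */
int
copy_loop(ready_file *files, size_t nfiles)
{
    int         handled = 0;

    assert(files != NULL || nfiles == 0);

    for (size_t i = 0; i < nfiles; i++)
    {
        int         failures = 0;           /* archive attempts for this file */
        int         failures_orphan = 0;    /* unlink attempts for this file */

        for (;;)
        {
            if (!files[i].wal_exists)
            {
                /* Orphaned .ready file: remove it, then move on to the next. */
                if (try_op(&files[i]))
                {
                    handled++;
                    break;
                }
                if (++failures_orphan >= NUM_ORPHAN_CLEANUP_RETRIES)
                    return handled;     /* give up archiving for now */
                /* the real loop would wait 1s here before retrying */
                continue;
            }

            /* Normal case: try to archive the segment. */
            if (try_op(&files[i]))
            {
                handled++;
                break;
            }
            if (++failures >= NUM_ARCHIVE_RETRIES)
                return handled;         /* give up archiving for now */
            /* the real loop would wait 1s here before retrying */
        }
    }
    return handled;
}
```

With these limits, an orphaned file whose removal fails twice is still cleaned
up on the third attempt and the loop proceeds to the next .ready file; a third
consecutive failure makes the loop give up until the next cycle.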

Nathan
