From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
---|---|
To: | "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com> |
Cc: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Speed up the removal of WAL files |
Date: | 2017-11-17 17:57:23 |
Message-ID: | CAHGQGwHXd+39svrz8KZZwKOY6VdwQqXQbMVaGh2r3cKRwvzgFg@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, Nov 17, 2017 at 5:20 PM, Tsunakawa, Takayuki
<tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com> wrote:
> From: Kyotaro HORIGUCHI [mailto:horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp]
>> The original code recycles some of the to-be-removed files, but the patch
>> removes all the victims. This may impact performance.
>
> Yes, I noticed it after submitting the patch and was wondering what to do. Thinking simply, I think it would be just enough to replace durable_unlink/durable_rename in RemoveXLogFile() with unlink/rename, and sync the pg_wal directory once in RemoveNonParentXlogFiles() and RemoveOldXlogFiles(). This will benefit the failover time when fast promotion is not performed. What do you think?
It does not seem like a good idea to just replace durable_rename() with rename()
in RemoveOldXlogFiles()->RemoveXlogFiles()->InstallXLogFileSegment(),
because that change could cause the following problem.
1. Checkpoint calls RemoveOldXlogFiles().
2. It recycles the WAL file AAA to BBB. The pg_wal directory has not been fsync'd yet.
3. Another transaction (TX1) writes its WAL data into the (recycled) file BBB.
4. CRASH and RESTART
5. The WAL file BBB disappears and you see AAA again,
but AAA is not used in recovery. This causes the loss of
TX1's data.
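
The sequence above can be sketched with a toy model. This is not PostgreSQL code: ToyFS and run_scenario are hypothetical names made up for illustration. The model assumes the worst case, in which a rename reaches disk only once the directory is fsync'd, while the WAL data written into the file is flushed and survives the crash.

```python
class ToyFS:
    """Toy filesystem: directory entries are volatile until fsync_dir() is called."""

    def __init__(self, names):
        # inode number -> list of records; file data is assumed to be flushed
        self.inodes = {i: [] for i, _ in enumerate(names)}
        # directory as currently seen by running processes
        self.volatile_dir = {name: i for i, name in enumerate(names)}
        # directory as it exists on stable storage
        self.durable_dir = dict(self.volatile_dir)

    def rename(self, old, new):
        # plain rename(): only the in-memory directory changes
        self.volatile_dir[new] = self.volatile_dir.pop(old)

    def fsync_dir(self):
        # what a durable rename adds: persist the directory entries
        self.durable_dir = dict(self.volatile_dir)

    def write(self, name, record):
        self.inodes[self.volatile_dir[name]].append(record)

    def crash(self):
        # after a restart only the durable directory entries remain
        self.volatile_dir = dict(self.durable_dir)

    def read(self, name):
        inode = self.volatile_dir.get(name)
        return None if inode is None else self.inodes[inode]


def run_scenario(durable):
    fs = ToyFS(["AAA"])
    fs.rename("AAA", "BBB")                 # step 2: recycle AAA to BBB
    if durable:
        fs.fsync_dir()                      # directory synced before BBB is used
    fs.write("BBB", "TX1 commit record")    # step 3: TX1 writes WAL into BBB
    fs.crash()                              # step 4: crash and restart
    # step 5: which files exist, and what recovery can read from BBB
    return sorted(fs.volatile_dir), fs.read("BBB")


print(run_scenario(durable=False))  # (['AAA'], None)
print(run_scenario(durable=True))   # (['BBB'], ['TX1 commit record'])
```

In the non-durable run, TX1's record still exists on disk, but only under the old name AAA, which recovery never reads.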
Regards,
--
Fujii Masao