From: | "Bossart, Nathan" <bossartn(at)amazon(dot)com> |
---|---|
To: | Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Hannu Krosing <hannuk(at)google(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: .ready and .done files considered harmful |
Date: | 2021-09-01 23:44:15 |
Message-ID: | 03826EAC-E5AE-40AB-B56B-8BEED1F5E534@amazon.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 8/25/21, 4:11 AM, "Dipesh Pandit" <dipesh(dot)pandit(at)gmail(dot)com> wrote:
> An alternate approach could be to force a directory scan at checkpoint to
> break the infinite wait for a .ready file which is being missed due to the
> fact that it is created out of order. This will make sure that the file
> gets archived within the checkpoint boundaries.
I think this is a good idea.
> Please find attached patch v11.
Thanks for the new version of the patch.
+ /*
+ * History files or a .ready file created out of order requires archiver to
+ * perform a full directory scan.
+ */
+ if (IsTLHistoryFileName(xlog) || IsBackupHistoryFileName(xlog) ||
+ fileOutOfOrder)
+ PgArchEnableDirScan();
I think we should force a directory scan for everything that isn't a
regular WAL file. IOW we can use !IsXLogFileName(xlog) instead of
enumerating all the different kinds of files we might want to archive.
+ /*
+ * Segment number of the most recent .ready file found by archiver,
+ * protected by WALArchiveLock.
+ */
+ XLogSegNo lastReadySegNo;
} PgArchData;
+/*
+ * Segment number and timeline ID to identify the next file in a WAL sequence
+ */
+typedef struct readyXLogState
+{
+ XLogSegNo lastSegNo;
+ TimeLineID lastTLI;
+} readyXLogState;
lastSegNo and lastReadySegNo appear to be the same thing. Couldn't we
just use the value in PgArchData?
+ return (curSegNo < lastSegNo) ? true : false;
I think this needs to be <=. If the two values are equal,
pgarch_readyXlog() may have just completed a directory scan and might
be just about to set PgArch->lastSegNo to a greater value.
+ LWLockAcquire(WALArchiveLock, LW_EXCLUSIVE);
+ PgArch->lastReadySegNo = segNo;
+ LWLockRelease(WALArchiveLock);
IMO we should just use a spinlock instead of introducing a new LWLock.
It looks like you really only need the lock for a couple of simple
functions. I still think protecting PgArch->dirScan with a spinlock
is a good idea, if for no other reason than it makes it easier to
reason about this logic.
+ if (stat(xlogready, &st) == 0)
I think we should ERROR if stat() fails for any other reason than
ENOENT.
+ ishistory = IsTLHistoryFileName(basename) ||
+ IsBackupHistoryFileName(basename);
I suspect we still want to prioritize timeline history files over
backup history files. TBH I find the logic below this point for
prioritizing history files to be difficult to follow, and I think we
should refactor it into some kind of archive priority comparator
function.
+ /*
+ * Reset the flag only when we found a regular WAL file to make
+ * sure that we are done with processing history files.
+ */
+ PgArch->dirScan = false;
I think we really want to unset dirScan before we start the directory
scan, and then we set it to true afterwards if we didn't find a
regular WAL file. If someone requests a directory scan in the middle
of an ongoing directory scan, we don't want to lose that request.
I attached two patches that demonstrate what I'm thinking this change
should look like. One is my take on the keep-trying-the-next-file
approach, and the other is a new version of the multiple-files-per-
readdir approach (with handling for "cheating" archive commands). I
personally feel that the multiple-files-per-readdir approach winds up
being a bit cleaner and more resilient than the keep-trying-the-next-
file approach. However, the keep-trying-the-next-file approach will
certainly be more efficient (especially for the extreme cases
discussed in this thread), and I don't have any concrete concerns with
this approach that seem impossible to handle.
Nathan
Attachment | Content-Type | Size |
---|---|---|
0001-keep-trying-the-next-file-approach.patch | application/octet-stream | 10.6 KB |
0001-Improve-performance-of-pgarch_readyXlog-with-many-st.patch | application/octet-stream | 10.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2021-09-01 23:56:53 | Re: Postgres Win32 build broken? |
Previous Message | Jaime Casanova | 2021-09-01 23:39:17 | Re: Numeric x^y for negative x |