| From: | Sutou Kouhei <kou(at)clear-code(dot)com> | 
|---|---|
| To: | cyberdemn(at)gmail(dot)com | 
| Cc: | smithpb2250(at)gmail(dot)com, torikoshia(at)oss(dot)nttdata(dot)com, horikyota(dot)ntt(at)gmail(dot)com, michael(at)paquier(dot)xyz, pgsql-hackers(at)postgresql(dot)org, bungina(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org | 
| Subject: | Re: pg_rewind WAL segments deletion pitfall | 
| Date: | 2024-07-12 07:24:06 | 
| Message-ID: | 20240712.162406.92904157696470075.kou@clear-code.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-bugs pgsql-hackers | 
Hi,
I'm reviewing patches in Commitfest 2024-07 from top to bottom:
https://commitfest.postgresql.org/48/
This is the 1st patch:
https://commitfest.postgresql.org/48/3874/
The latest patch can't be applied on master:
https://www.postgresql.org/message-id/CAFh8B=nNJtm9ke4_1mhpwGz2PV9yoyF6hMnYh5XACt0AA4VG-A@mail.gmail.com
I've rebased on master. See the attached patch.
Here are changes for it:
* Resolve conflict
* Update copyright year to 2024 from 2023
* Add an added test to meson.build
* Run pgindent
Here are my review comments:
@@ -217,6 +221,26 @@ findLastCheckpoint(const char *datadir, XLogRecPtr forkptr, int tliIndex,
+			char		xlogfname[MAXFNAMELEN];
+
+			tli = xlogreader->seg.ws_tli;
+			segno = xlogreader->seg.ws_segno;
+
+			snprintf(xlogfname, MAXPGPATH, XLOGDIR "/");
+			XLogFileName(xlogfname + strlen(xlogfname),
+						 xlogreader->seg.ws_tli,
+						 xlogreader->seg.ws_segno, WalSegSz);
+
+			/*
+			 * Make sure pg_rewind doesn't remove this file, because it is
+			 * required for postgres to start after rewind.
+			 */
+			insert_keepwalhash_entry(xlogfname);
MAXFNAMELEN is 64 and MAXPGPATH is 1024. strlen(XLOGDIR "/")
is 7 because XLOGDIR is "pg_wal". So xlogfname has enough
size but snprintf(xlogfname, MAXPGPATH) is wrong usage.
(And XLogFileName() uses snprintf(xlogfname, MAXFNAMELEN)
internally.)
How about using one more buffer?
----
char		xlogpath[MAXPGPATH];
char		xlogfname[MAXFNAMELEN];
tli = xlogreader->seg.ws_tli;
segno = xlogreader->seg.ws_segno;
XLogFileName(xlogfname,
			 xlogreader->seg.ws_tli,
			 xlogreader->seg.ws_segno, WalSegSz);
snprintf(xlogpath, MAXPGPATH, "%s/%s", XLOGDIR, xlogfname);
/*
 * Make sure pg_rewind doesn't remove this file, because it is
 * required for postgres to start after rewind.
 */
insert_keepwalhash_entry(xlogpath);
----
Thanks,
-- 
kou
In <CAFh8B=mDDZEsK0jDMfvP3MmxkWaeTCxW4yN42OH33JY6sQWS5Q(at)mail(dot)gmail(dot)com>
  "Re: pg_rewind WAL segments deletion pitfall" on Tue, 23 Jan 2024 09:23:29 +0100,
  Alexander Kukushkin <cyberdemn(at)gmail(dot)com> wrote:
> Hi Peter,
> 
> On Mon, 22 Jan 2024 at 00:38, Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> 
>> 2024-01 Commitfest.
>>
>> Hi, This patch has a CF status of "Ready for Committer", but it is
>> currently failing some CFbot tests [1]. Please have a look and post an
>> updated version..
>>
>> ======
>> [1]
>> https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/46/3874
>>
>>
> From what I can see all failures are not related to this patch:
> 1. Windows build failed with
> [10:52:49.679] 126/281 postgresql:recovery / recovery/019_replslot_limit
> ERROR 185.84s (exit status 255 or signal 127 SIGinvalid)
> 2. FreeBSD build failed with
> [09:11:57.656] 190/285 postgresql:psql / psql/010_tab_completion ERROR
> 0.46s exit status 2
> [09:11:57.656] 220/285 postgresql:authentication /
> authentication/001_password ERROR 0.57s exit status 2
> 
> In fact, I don't even see this patch being applied for these builds and the
> introduced TAP test being executed.
> 
> Regards,
> --
> Alexander Kukushkin
| Attachment | Content-Type | Size | 
|---|---|---|
| v9-0001-Be-more-picky-with-WAL-segment-deletion-in-pg_rew.patch | text/x-patch | 8.1 KB | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | PG Bug reporting form | 2024-07-12 07:58:27 | BUG #18533: pg_basebackup uses out-of-bounds memory and a segment error occurs during backup | 
| Previous Message | Michael Paquier | 2024-07-12 03:42:07 | Re: Potential data loss due to race condition during logical replication slot creation | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Bertrand Drouvot | 2024-07-12 07:54:51 | Re: Restart pg_usleep when interrupted | 
| Previous Message | Richard Guo | 2024-07-12 07:11:19 | Re: Check lateral references within PHVs for memoize cache keys |