Re: Allow pg_archivecleanup to remove backup history files

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Allow pg_archivecleanup to remove backup history files
Date: 2023-05-15 06:16:46
Message-ID: e02ae6b16a32bdd2f4856e9b7f8a6439@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2023-05-15 09:18, Michael Paquier wrote:
> On Fri, May 12, 2023 at 05:53:45PM +0900, torikoshia wrote:
>> On 2023-05-10 17:52, Bharath Rupireddy wrote:
>> I was a little concerned about what to do when deleting both the files
>> ending in .gz and backup history files.
>> Is making it possible to specify both "-x .backup" and "-x .gz" the
>> way to
>> go?
>>
>> I also concerned someone might add ".backup" to WAL files, but does
>> that
>> usually not happen?
>
> Depends on the archive command, of course. I have seen code using
> this suffix for some segment names in the past, FWIW.

Thanks for the information.
I'm going to stop adding special logic for "-x .backup" and add a new
option for removing backup history files.

>>> Comments on the patch:
>>> 1. Why just only the backup history files? Why not remove the
>>> timeline
>>> history files too? Is it because there may not be as many tli
>>> switches
>>> happening as backups?
>>
>> Yeah, do you think we should also add logic for '-x .history'?
>
> Timeline history files can be critical pieces when it comes to
> assigning a TLI as these could be retrieved by a restore_command
> during recovery for a TLI jump or just assign a new TLI number at the
> end of recovery, still you could presumably remove the TLI history
> files that are older than the TLI the segment defined refers too.
> (pg_archivecleanup has no specific logic to look at the history with
> child TLIs for example, to keep it simple, and I'd rather keep it this
> way). There may be an argument for making that optional, of course,
> but it does not strike me as really necessary compared to the backup
> history files which are just around for debugging purposes.

Agreed.

>>> 2.+sub remove_backuphistoryfile_run_check
>>> +{
>>> Why to invent a new function when run_check() can be made generic
>>> with
>>> few arguments passed?
>>
>> Thanks, I'm going to reconsider it.
>
> + <para>
> + Remove files including backup history files, whose suffix is
> <filename>.backup</filename>.
> + Note that when <replaceable>oldestkeptwalfile</replaceable>
> is a backup history file,
> + specified file is kept and only preceding WAL files and
> backup history files are removed.
> + </para>
>
> This addition to the documentation looks unprecise to me. Backup
> history files have a more complex format than just the .backup
> suffix, and this is documented in backup.sgml.

I'm going to remove the explanation for the backup history files and
just add the hyperlink to the original explanation in backup.sgml.

> How about plugging in some long options, and use something more
> explicit like --clean-backup-history?

Agreed.

>
> - if ((IsXLogFileName(walfile) || IsPartialXLogFileName(walfile)) &&
> + if (((IsXLogFileName(walfile) || IsPartialXLogFileName(walfile)) ||
> + (removeBackupHistoryFile &&
> IsBackupHistoryFileName(walfile))) &&
> strcmp(walfile + 8, exclusiveCleanupFileName + 8) < 0)
>
> Could it be a bit cleaner to split this check in two, saving one level
> of indentation on the way for its most inner loop? I would imagine
> something like:
> /* Check file name */
> if (!IsXLogFileName(walfile) &&
> !IsPartialXLogFileName(walfile))
> continue;
> /* Check cutoff point */
> if (strcmp(walfile + 8, exclusiveCleanupFileName + 8) >= 0)
> continue;
> //rest of the code doing the unlinks.
> --
Thanks, that looks better.

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-05-15 06:22:41 Re: Allow pg_archivecleanup to remove backup history files
Previous Message Kirk Wolak 2023-05-15 06:00:56 psql: Could we get "-- " prefixing on the **** QUERY **** outputs? (ECHO_HIDDEN)