Re: [Patch] Make pg_checksums skip foreign tablespace directories

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: David Steele <david(at)pgmasters(dot)net>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, michael(dot)banck(at)credativ(dot)de, pgsql-hackers(at)postgresql(dot)org, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: [Patch] Make pg_checksums skip foreign tablespace directories
Date: 2020-02-20 06:55:32
Message-ID: 20200220065532.GK2288@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 19, 2020 at 12:37:00PM -0600, David Steele wrote:
> On 2/19/20 2:13 AM, Michael Paquier wrote:
>> Please note that pg_internal.init is listed within noChecksumFiles in
>> basebackup.c, so we would miss any temporary pg_internal.init.PID if
>> we don't check after the file prefix and the base backup would issue
>> extra WARNING messages, potentially masking messages that could
>> matter. So let's fix that as well.
>
> Agreed. Though, I think pg_internal.init.XX should be excluded from the
> backup as well.

Sure. That's the intention. pg_rewind, pg_checksums and basebackup.c
are all the things on my list.

> As far as I can see, the pg_internal.init.XX will not be cleaned up by
> PostgreSQL on startup. I've only tested this in 9.6 so far, but I don't see
> any differences in the code since then that would lead to better behavior.
> Perhaps that's also something we should fix?

Not sure that it is worth spending cycles on that at the beginning of
recovery as when a mapping file is written its temporary entry
truncates any existing one present matching its name.

> I'm really not a fan of a blind prefix match. I think we should stick with
> only excluding files that are created by Postgres.

Thinking more on that, excluding any backup_label with a custom suffix
worries me as it could cause a potential breakage for exiting backup
solutions. So attached is an updated patch making things in a
smarter way: I have added to the exclusion lists the possibility to
match an entry based on its prefix, or not, the choice being optional.
This solves the problem with pg_internal.PID and is careful to not
exclude unnecessary entries like suffixed backup labels or such. This
leads to some extra duplication within pg_rewind, basebackup.c and
pg_checksums but I think we can live with that, and that makes
back-patching simpler. Refactoring is still tricky though as it
relates to the use of paths across the backend and the frontend..

> So backup_label.old and
> tablespace_map.old should just be added to the exclude list. That's how we
> have it in pgBackRest.

That would be a behavior change. We could change that on HEAD, but I
don't think that this can be back-patched as this does not cause an
actual problem.

For now, my proposal is to fix the prefix first, and then let's look
at the business with tablespaces where needed.
--
Michael

Attachment Content-Type Size
exclude-prefix-filter-v2.patch text/x-diff 12.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Haumacher, Bernhard 2020-02-20 07:02:49 Re: Error on failed COMMIT
Previous Message Hubert Zhang 2020-02-20 06:39:44 Re: Print physical file path when checksum check fails