From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | David Steele <david(at)pgbackrest(dot)org> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Return pg_control from pg_backup_stop(). |
Date: | 2024-10-03 04:45:43 |
Message-ID: | Zv4hdzZURFAYYAL5@paquier.xyz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Oct 02, 2024 at 09:03:27AM +0000, David Steele wrote:
> On 10/2/24 10:11, Michael Paquier wrote:
>> Hmm, okay. There is also a slight impact for BASE_BACKUP, requiring
>> basebackup_progress_wait_wal_archive() and do_pg_backup_stop() to be
>> called earlier when sending the main data directory which is the last
>> one in the list of tablespaces. As far as I can see, this does not
>> change the logic because do_pg_backup_stop() does not touch the
>> control file, but it seems to me that we'd rather keep these two
>> calls as they are now, and send the control file once we are out of
>> the for loop that processes all the tablespaces. That seems slightly
>> cleaner to me, still I agree that both are the same things.
>
> Sending pg_control later results in even more code churn because of how tar
> files are terminated. I've updated it that way in v2 so you can see what I
> mean. I don't have a strong preference, though, so if you prefer the
> implementation in v2 then that's fine with me.
It does not make much of a difference, indeed.
>> Anyway, finishing do_pg_backup_stop() and then sending the control
>> file is just a consequence of the implementation choice driven by the
>> output required for the SQL function so as this is stored in the
>> backup state to get it down to pg_backup_stop() in xlogfuncs.c, so we
>> could also take one step back and forget about the SQL function,
>> setting only the new flag when sending a BASE_BACKUP, or just not use
>> the backupState to store this data as the exercise involves forcing
>> one boolean and recalculate a CRC32.
>
> I can definitely see us making other updates to pg_control so I would rather
> keep this logic centralized, even though it is not too complicated at this
> point. Still, even 8 lines of code (as it is now) seems better not to
> duplicate.
I was wondering if the field update should be hidden behind a macro
that uses an offsetof() on ControlFileData, with the name of the field
and a pointer to the value to update to. If you include the CRC32
calculation in that, that makes for less chunks of code when updating
one field of the control file.
The full CRC calculation could also be hidden inside a macro, as there
are a couple of places where we do the same things, like pg_rewind.c,
etc.
>> We're talking about a 8kB file which has a size of 512B
>> (PG_CONTROL_MAX_SAFE_SIZE) to avoid such issues. So I'm not sure to
>> see your point here?
>
> Even at 512B it is possible to see tears in pg_control and they happen in
> the build farm right now. In fact, this thread [1] trying to fix the problem
> was what got me thinking about alternate solutions to preventing tears in
> pg_control. Thomas' proposed fixes have not been committed to my knowledge
> so the problem remains, but would be fixed by this commit.
Ah, right. That rings a bell. Thomas has done some work with
c558e6fd92ff and 63a582222c6b. And we're still not taking the
ControlFileLock while copying it over.. It looks like we should do it
separately, and backpatch. That's not something for this thread to
worry about.
>> Perhaps existing
>> backup solutions are good enough risk vs reward is not worth it?
>
> I'm not sure I see the risk here. Saving out pg_control is optional so no
> changes to current software is required. Of course they miss the benefit of
> the protection against tears and missing backup_label, but that is a choice.
>
> Again, optional, but if I was able to manage these saves using the psql
> interface in the TAP tests then I'd say it would be pretty easy for anyone
> with a normal connection to Postgres. Also, we require users to treat
> tabelspace_map and backup_label as binary so not too big a change here.
Maintenance cost for a limited user impact overall. With incremental
backups being a thing in v18 only available through the replication
protocol, the SQL functions have less advantages these days. My point
would be to see this thread as a two-step process:
1) Update the flag in the control file when sending it across in
replication stream.
2) Do the SQL function thing with the bytea for the control file, if
necessary.
1) is something that has more value than 2), IMO, because there is no
need for a manual step when a backup is taken by the replication
protocol. Well, custom backup solutions that rely on the replication
protocol to copy the data would need to make sure that they have a
backup_label, but that's something they should do anyway and what this
patch wants to protect users from. The SQL part is optional IMO. It
can be done, but it has less impact overall and makes backups more
complicated by requiring the manual copy of the control file.
--
Michael
From | Date | Subject | |
---|---|---|---|
Next Message | Masahiko Sawada | 2024-10-03 04:47:07 | Re: Using per-transaction memory contexts for storing decoded tuples |
Previous Message | Hayato Kuroda (Fujitsu) | 2024-10-03 04:42:05 | RE: Using per-transaction memory contexts for storing decoded tuples |