From: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, David Steele <david(at)pgmasters(dot)net>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?) |
Date: | 2022-09-20 11:43:49 |
Message-ID: | CALj2ACWo57XEkJBnW+Ow=VBRPDuzt1ws8jbtZhswDbmrkQvOqA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Sep 20, 2022 at 7:20 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> The main regression test suite should not include direct calls to
> pg_backup_start() or pg_backup_stop() as these depend on wal_level,
> and we spend a certain amount of resources in keeping the tests a
> maximum portable across such configurations, wal_level = minimal being
> one of them. One portable example is in 001_stream_rep.pl.
Understood.
> > That's a good idea. I'm marking a flag if the label name overflows (>
> > MAXPGPATH), later using it in do_pg_backup_start() to error out. We
> > could've thrown error directly, but that changes the behaviour - right
> > now, first "
> > wal_level must be set to \"replica\" or \"logical\" at server start."
> > gets emitted and then label name overflow error - I don't want to do
> > that.
>
> - if (strlen(backupidstr) > MAXPGPATH)
> + if (state->name_overflowed == true)
> ereport(ERROR,
> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> errmsg("backup label too long (max %d bytes)",
> It does not strike me as a huge issue to force a truncation of such
> backup label names. 1024 is large enough for basically all users,
> in my opinion. Or you could just truncate in the internal logic, but
> still complain at MAXPGPATH - 1 as the last byte would be for the zero
> termination. In short, there is no need to complicate things with
> name_overflowed.
We currently allow MAXPGPATH bytes of label name excluding null
termination. I don't want to change this behaviour. In the attached v9
patch, I'm truncating the larger names to MAXPGPATH + 1 bytes in
backup state (one extra byte for representing that the name has
overflown, and another extra byte for null termination).
> + StringInfo backup_label; /* backup_label file contents */
> + StringInfo tablespace_map; /* tablespace_map file contents */
> + StringInfo history_file; /* history file contents */
> IMV, repeating a point I already made once upthread, BackupState
> should hold none of these. Let's just generate the contents of these
> files in the contexts where they are needed, making BackupState
> something to rely on to build them in the code paths where they are
> necessary. This is going to make the reasoning around the memory
> contexts where each one of them is stored much easier and reduce the
> changes of bugs in the long-term.
I've separated out these variables from the backup state, please see
the attached v9 patch.
> > I've also taken help of the error callback mechanism to clean up the
> > allocated memory in case of a failure. For do_pg_abort_backup() cases,
> > I think we don't need to clean the memory because that gets called on
> > proc exit (before_shmem_exit()).
>
> Memory could still bloat while the process running the SQL functions
> is running depending on the error code path, anyway.
I didn't get your point. Can you please elaborate it? I think adding
error callbacks at the right places would free up the memory for us.
Please note that we already are causing memory leaks on HEAD today.
I addressed the above review comments. I also changed a wrong comment
[1] that lies before pg_backup_start() even after the removal of
exclusive backup.
I'm attaching v9 patch set herewith, 0001 - refactors the backup code
with backup state, 0002 - adds error callbacks to clean up the memory
allocated for backup variables. Please review them further.
[1]
* Essentially what this does is to create a backup label file in $PGDATA,
* where it will be archived as part of the backup dump. The label file
* contains the user-supplied label string (typically this would be used
* to tell where the backup dump will be stored) and the starting time and
* starting WAL location for the dump.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment | Content-Type | Size |
---|---|---|
v9-0001-Refactor-backup-related-code.patch | application/x-patch | 34.7 KB |
v9-0002-Add-error-callbacks-for-deallocating-backup-relat.patch | application/x-patch | 8.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2022-09-20 12:00:58 | Re: why can't a table be part of the same publication as its schema |
Previous Message | Bharath Rupireddy | 2022-09-20 11:43:17 | Re: binary version of pg_current_wal_insert_lsn and pg_walfile_name functions |