Re: The backup API and general purpose backup software

From: Ron <ronljohnsonjr(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: The backup API and general purpose backup software
Date: 2020-06-21 15:32:16
Message-ID: a48cc2eb-2407-95ef-ac56-aa26518ae4d2@gmail.com
Lists: pgsql-general

Peter,

I don't understand that last step "5. copy the result from the previous step
to the backup medium."  It seems to be a duplication of "3. copy the
contents of the data directory to the backup medium".

On 6/21/20 8:28 AM, Peter J. Holzer wrote:
> This is inspired by the thread with the subject "Something else about
> Redo Logs disappearing", but since that thread is already quite long,
> since I have lost track of what exactly "Peter"'s problem is and since his
> somewhat belligerent tone makes it unappealing to reread the whole
> thread I'll attempt a fresh start.
>
> To make a full backup with the "new" (non-exclusive) API, a program
> must do the following:
>
> 1. open a connection to the database
>
> 2. invoke pg_start_backup('label', false, false) in the connection from
> step 1.
>
> 3. copy the contents of the data directory to the backup medium
>
> 4. invoke pg_stop_backup(false, true) in the connection from step 1.
>
> 5. copy the result from the previous step to the backup medium.
>
> (It is assumed that any archived WALs are also archived in a manner that
> they can be restored together with this backup. If this is not the case
> adding them to the backup would be step 6.)
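
The five steps above can be sketched in Python as follows. This is only a minimal sketch under stated assumptions: `conn` is an already open DB-API connection (step 1) whose driver accepts `%s` placeholders, as psycopg2 does; the server still provides pg_start_backup/pg_stop_backup with the signatures quoted above; and the function name `run_base_backup` and the target layout are made up for illustration. It also shows that the "result" in step 5 is what pg_stop_backup returns (the contents of backup_label and tablespace_map), not the data directory again.

```python
import shutil
from pathlib import Path


def run_base_backup(conn, data_dir, target_dir, label):
    """Run steps 2-5 on an already open DB-API connection (step 1).

    Hypothetical helper: a real tool would also skip files such as
    postmaster.pid and handle errors during the copy.
    """
    target = Path(target_dir)
    cur = conn.cursor()

    # Step 2: start a non-exclusive backup (fast=false, exclusive=false).
    cur.execute("SELECT pg_start_backup(%s, false, false)", (label,))
    cur.fetchone()

    # Step 3: copy the data directory while the session stays open.
    shutil.copytree(data_dir, target / "data")

    # Step 4: stop the backup in the same session.
    cur.execute(
        "SELECT lsn, labelfile, spcmapfile FROM pg_stop_backup(false, true)"
    )
    lsn, labelfile, spcmapfile = cur.fetchone()

    # Step 5: store what pg_stop_backup returned on the backup medium.
    (target / "backup_label").write_text(labelfile)
    if spcmapfile:
        (target / "tablespace_map").write_text(spcmapfile)
    return lsn
```

Keeping everything on one connection matters here: if the session that called pg_start_backup goes away before pg_stop_backup, the server aborts the backup.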
>
> So far so good and writing a program which implements this should not
> pose a great difficulty.
>
> General purpose backup software often assumes that it can perform a
> backup in three steps:
>
> 1. Invoke a pre-backup script.
>
> 2. Copy files to the backup medium in such a way that they can be
> identified as a group and be restored together and without mixing
> them with data from another backup session.
>
> 3. Invoke a post-backup script.
>
> (I have to admit that it has been a long time since I've looked at any
> backup system in any detail. Doubtlessly many have a more complicated
> model, but I'm fairly confident that this is still the lowest common
> denominator.)
>
> Now we have two problems.
>
> The first is that the pre-backup script has to exit before the proper
> backup can begin, but it also has to open a connection which will stay
> open during the backup because it will be needed in the post-backup
> script again. This can be solved by starting a background process which
> keeps the connection open and providing some communication channel
> (perhaps a Unix pipe) for the post-backup script to communicate with
> this background process. A bit awkward, but no big deal.
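
The background process described above might look roughly like this in Python. It is a sketch, not a finished daemon: `hold_backup_session` and the line-based "stop" protocol are invented for illustration, `conn` is assumed to be an open DB-API connection using `%s` placeholders, and `requests`/`replies` are assumed to be text streams (for example two named pipes opened by the process that the pre-backup script starts).

```python
import json


def hold_backup_session(conn, label, requests, replies):
    """Keep one session open between the pre- and post-backup scripts.

    Returns True if the backup was stopped cleanly, False if the request
    channel closed before a "stop" command arrived.
    """
    cur = conn.cursor()
    cur.execute("SELECT pg_start_backup(%s, false, false)", (label,))
    cur.fetchone()
    replies.write(json.dumps({"status": "started"}) + "\n")
    replies.flush()

    # Block here while the backup software copies files; the connection
    # stays open the whole time.
    for line in requests:
        if line.strip() != "stop":
            continue
        cur.execute(
            "SELECT lsn, labelfile, spcmapfile FROM pg_stop_backup(false, true)"
        )
        lsn, labelfile, spcmapfile = cur.fetchone()
        reply = {
            "status": "stopped",
            "lsn": str(lsn),
            "labelfile": labelfile,
            "spcmapfile": spcmapfile,
        }
        replies.write(json.dumps(reply) + "\n")
        replies.flush()
        return True

    # No "stop" was received: the caller should close the connection,
    # which makes the server abort the backup.
    return False
```

The post-backup script would then write "stop" to the request pipe and read one JSON line back from the reply pipe.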
>
> The second problem is that the post-backup script will only be called
> after the backup is finished. So the information returned by pg_stop_backup
> cannot be included in the backup which makes the backup useless. This is
> indeed serious, and I don't see a way around this in this simple model.
>
> But there is a workaround: Split the backup.
>
> I am assuming that the backup software uses a unique ID to identify each
> backup and passes that ID to the pre-backup and post-backup script.
> This ID is used as the label in the call to pg_start_backup().
> It is also included in the information returned by pg_stop_backup(). So
> the post-backup script stores that information in a different place
> (maybe including the ID in the filename(s) to avoid conflicts and for
> redundancy) and then triggers a backup of that place (or alternatively
> that can be backed up independently).
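
As a sketch of the post-backup side of this workaround: the function below writes the text returned by pg_stop_backup into a separate directory, with the backup ID in the file names. The name `store_stop_info`, the directory argument and the file naming scheme are assumptions chosen for illustration, not something the backup software prescribes.

```python
from pathlib import Path


def store_stop_info(summary_dir, backup_id, labelfile, spcmapfile=""):
    """Save pg_stop_backup's output under names that include the backup ID."""
    out = Path(summary_dir)
    out.mkdir(parents=True, exist_ok=True)

    # The label file text already contains the ID (it was used as the label);
    # repeating it in the file name avoids clashes between backup sessions.
    label_path = out / f"{backup_id}.backup_label"
    label_path.write_text(labelfile)
    written = [label_path]

    # The tablespace map is empty unless tablespaces outside the data
    # directory exist, so only write it when there is something to store.
    if spcmapfile:
        map_path = out / f"{backup_id}.tablespace_map"
        map_path.write_text(spcmapfile)
        written.append(map_path)
    return written
```

The directory passed as `summary_dir` is then what the second, independent backup run has to pick up.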
>
> To restore a backup, you will then also need two steps:
>
> 1) Restore the summary information from the second backup. Inspect the
> backup_label to find the ID of the backup of the data.
>
> 2) Restore that backup
>
> 3) Put the backup label where it belongs, make sure the archived WALs
> are accessible and start the database.
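
The restore steps can be sketched like this. The assumptions: the saved label file text contains a "LABEL: " line carrying the backup ID (as it does when the ID was passed as the label to pg_start_backup), and the two helper names are hypothetical. Restoring the data backup itself and making the archived WALs available is left to the backup software and the recovery configuration.

```python
import shutil
from pathlib import Path


def backup_id_from_label(label_text):
    """Step 1: find the ID of the data backup in the saved label file."""
    prefix = "LABEL: "
    for line in label_text.splitlines():
        if line.startswith(prefix):
            return line[len(prefix):].strip()
    raise ValueError("no LABEL line found in backup_label")


def install_backup_label(saved_label_path, restored_data_dir):
    """Step 3: put the label file into the restored data directory.

    The server expects it to be named exactly "backup_label" at the top
    level of the data directory.
    """
    dest = Path(restored_data_dir) / "backup_label"
    shutil.copyfile(saved_label_path, dest)
    return dest
```

Between the two calls, step 2 restores the data backup identified by the returned ID into `restored_data_dir`.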
>
> hp
>
>
>

--
Angular momentum makes the world go 'round.
