Re: Updated backup APIs for non-exclusive backups

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Marco Nenciarini <marco(dot)nenciarini(at)2ndquadrant(dot)it>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Updated backup APIs for non-exclusive backups
Date: 2016-04-20 05:12:59
Message-ID: CAHGQGwHUkEbkVexVfWNLjmq2rzOS_SHYMiECt+KBn-cBPq5Arg@mail.gmail.com
Lists: pgsql-hackers

On Sun, Apr 17, 2016 at 1:22 AM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> On Wed, Apr 13, 2016 at 4:07 AM, Noah Misch <noah(at)leadboat(dot)com> wrote:
>>
>> On Tue, Apr 12, 2016 at 10:08:23PM +0200, Magnus Hagander wrote:
>> > On Tue, Apr 12, 2016 at 8:39 AM, Noah Misch <noah(at)leadboat(dot)com> wrote:
>> > > On Mon, Apr 11, 2016 at 11:22:27AM +0200, Magnus Hagander wrote:
>> > > > Well, if we *don't* do the rewrite before we release it, then we
>> > > > have to instead put information about the new version of the
>> > > > functions into the old structure I think.
>> > > >
>> > > > So I think it's an open issue.
>> > >
>> > > Works for me...
>> > >
>> > > [This is a generic notification.]
>> > >
>> > > The above-described topic is currently a PostgreSQL 9.6 open item.
>> > > Magnus, since you committed the patch believed to have created it,
>> > > you own this open item. If that responsibility lies elsewhere,
>> > > please let us know whose responsibility it is to fix this. Since new
>> > > open items may be discovered at any time and I want to plan to have
>> > > them all fixed well in advance of the ship date, I will appreciate
>> > > your efforts toward speedy resolution. Please present, within 72
>> > > hours, a plan to fix the defect within seven days of this message.
>> > > Thanks.
>> > >
>> >
>> > I won't have time to do the bigger rewrite/reordering by then, but I can
>> > certainly commit to having the smaller updates done to cover the new
>> > functionality in less than a week. If nothing else, that'll be something
>> > for me to do on the flight over to pgconf.us.
>>
>> Thanks for that plan; it sounds good.
>
>
> Here's a suggested patch.
>
> There is some duplication between the non-exclusive and exclusive backup
> sections, but I wanted to make sure that each set of instructions can just
> be followed top-to-bottom.
>
> I've also removed some tips that aren't really necessary as part of the
> step-by-step instructions in order to keep things from exploding in size.
>
> Finally, I've changed references to "backup dump" to just be "backup",
> because it's confusing to call them something with dumps in when it's not
> pg_dump. Enough that I got partially confused myself while editing...
>
> Comments?

+ Low level base backups can be made in a non-exclusive or an exclusive
+ way. The non-exclusive method is recommended and the exclusive one will
+ at some point be deprecated and removed.

I don't object to adding a non-exclusive mode of low level backup,
but I disagree with marking the exclusive backup as deprecated, at least
until we can alleviate some of the pain that the non-exclusive mode causes.

One example of the pain: in a non-exclusive backup, we need to keep
the IDLE connection that was used to execute pg_start_backup() open
until the end of the backup. Of course a backup can take a very
long time, so the IDLE connection also needs to remain open
for that long. If it's accidentally terminated (e.g., because
something closes connections that have been IDLE too long), the backup
fails and needs to be taken again from the beginning.

Another pain point of a non-exclusive backup is having to execute both
pg_start_backup() and pg_stop_backup() on the same connection.
Please imagine the case where psql is used to execute those two
backup functions (I believe that there are many users who do this).
For example,

psql -c "SELECT pg_start_backup()"
rsync, cp, tar, storage backup, or something
psql -c "SELECT pg_stop_backup()"

A non-exclusive backup breaks these very simple steps because
the two backup functions are executed on different connections.
So, how should we modify the steps for a non-exclusive backup?
Basically we need to pause psql after pg_start_backup(), signal it
to resume after the copy of the database cluster has been taken, and make
it execute pg_stop_backup(). I'm afraid that the backup script
will become complicated because of this pain of non-exclusive backup.
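
For illustration only, one alternative to pausing and signalling psql
would be a small client program that issues both calls on a single
session. The sketch below makes several assumptions: "conn" is any
Python DB-API connection whose driver uses the %s parameter style
(psycopg2 would be one such driver), "copy_cluster" is a hypothetical
callable standing in for rsync, cp, tar or a storage backup, and the
function arguments follow the 9.6 forms discussed in this thread.

```python
def non_exclusive_backup(conn, copy_cluster, label="backup"):
    """Run start, copy and stop on ONE session; return the pg_stop_backup() row.

    conn         -- DB-API connection (assumption: '%s' paramstyle)
    copy_cluster -- hypothetical callable that copies the database cluster
    label        -- arbitrary backup label (name chosen for this example)
    """
    cur = conn.cursor()
    # Third argument false selects the non-exclusive mode.
    cur.execute("SELECT pg_start_backup(%s, false, false)", (label,))
    cur.fetchall()
    # The connection has to stay open for as long as the copy runs.
    copy_cluster()
    # Must be issued on the same connection that started the backup.
    cur.execute("SELECT * FROM pg_stop_backup(false)")
    row = cur.fetchone()
    cur.close()
    return row
```

With psycopg2 one would probably also enable autocommit on the
connection so that the session does not sit idle inside a transaction
during the copy. This does not make the psql-based script any simpler;
it only shows what "same connection" means in practice.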

+ The <function>pg_stop_backup</> will return one row with three
+ values. The second of these fields should be written to a file named
+ <filename>backup_label</> in the root directory of the backup. The
+ third field should be written to a file named
+ <filename>tablespace_map</> unless the field is empty.

How should we write those two values to different files when
we execute pg_stop_backup() via psql? Should the whole output of
pg_stop_backup() be written to a temporary file, then filtered
and written to two different files using some Linux commands?
This also seems to make the backup script more complicated.
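
To make the file-writing step concrete, here is a minimal sketch in
Python. It assumes the row returned by pg_stop_backup() is already
available to a client program as a three-element tuple (for instance
from a database driver rather than from psql output). The function name
"write_backup_files" and the "backup_root" parameter are made up for
this example.

```python
import os


def write_backup_files(row, backup_root):
    """Write fields 2 and 3 of the pg_stop_backup() row into the backup.

    Returns the list of file names that were actually written.
    """
    _first, label, spcmap = row  # the first field is not written to a file
    written = []

    # Second field goes to backup_label in the root directory of the backup.
    with open(os.path.join(backup_root, "backup_label"), "w", newline="") as f:
        f.write(label)
    written.append("backup_label")

    # Third field goes to tablespace_map, unless the field is empty.
    if spcmap:
        path = os.path.join(backup_root, "tablespace_map")
        with open(path, "w", newline="") as f:
            f.write(spcmap)
        written.append("tablespace_map")

    return written
```

This only covers the case where a driver hands the fields over
separately; it does not answer how to split them out of psql output.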

Regards,

--
Fujii Masao
