Re: Extending BASE_BACKUP in replication protocol: incremental backup and backup format

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Extending BASE_BACKUP in replication protocol: incremental backup and backup format
Date: 2014-01-14 13:41:48
Message-ID: CABUevEzw3rUR7oV8nhpMSwGivtrdME2QbmV+zkrKEjVOyh=bjw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jan 14, 2014 at 2:16 PM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:

> On Tue, Jan 14, 2014 at 10:01 PM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
> >> - Addition of an option called INCREMENTAL to send an incremental
> >> backup to the client. This option uses as input an LSN, and sends back
> >> to client relation pages (in the shape of reduced relation files) that
> >> are newer than the LSN specified by looking at pd_lsn of
> >> PageHeaderData. In this case the LSN needs to be determined by client
> >> based on the latest full backup taken. This option is particularly
> >> interesting to reduce the amount of data taken between two backups,
> >> even if it increases the restore time as client needs to reconstitute
> >> a base backup depending on the recovery target and the pages modified.
> >> Client would be in charge of rebuilding pages from incremental backup
> >> by scanning all the blocks that need to be updated based on the full
> >> backup as the LSN from which incremental backup is taken is known. But
> >> this is not really something the server cares about... Such things are
> >> actually done by pg_rman as well.
> >
> >
> > How does the server find all the pages with LSN > the threshold? If it
> > needs to scan the whole database, it's not all that useful. I guess it
> > would be better than nothing, but I think you might as well just use rsync.
> Yes, it would be necessary to scan the whole database as the LSN to be
> checked is kept in PageHeaderData :). Perhaps it is not that
> performant, but my initial thought was that perhaps the amount of data
> necessary to maintain incremental backups could balance with the
> amount of WAL necessary to keep and limit the whole amount on disk.
>

It wouldn't be worse performance-wise than a full backup. That one also has
to read all the blocks, after all... You're decreasing network traffic and
client storage, with the same I/O on the server side. Seems worthwhile.
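
To make the trade-off concrete, here is a rough sketch of the pd_lsn check
discussed above. It is only an illustration, not how the server would
implement it: the function names are made up, the block size is assumed to
be the default 8192 bytes, and the page header is assumed to come from a
little-endian machine (pd_lsn is the first 8 bytes of the page, stored as
two 32-bit halves, high half first). Every block is read, but only the
blocks newer than the threshold are kept for sending.

```python
import struct

BLOCK_SIZE = 8192  # assumption: default PostgreSQL block size (BLCKSZ)


def page_lsn(page: bytes) -> int:
    """Return pd_lsn taken from the first 8 bytes of a page.

    Assumes little-endian storage: two uint32 values, the high half
    of the LSN first, then the low half.
    """
    high, low = struct.unpack_from("<II", page, 0)
    return (high << 32) | low


def changed_blocks(data: bytes, threshold_lsn: int):
    """Read every block of a relation file image and select the new ones.

    Returns (blocks_read, selected), where selected is a list of
    (block_number, page) pairs whose pd_lsn is greater than threshold_lsn.
    A trailing partial block is ignored in this sketch.
    """
    blocks_read = 0
    selected = []
    for offset in range(0, len(data), BLOCK_SIZE):
        page = data[offset:offset + BLOCK_SIZE]
        if len(page) < BLOCK_SIZE:
            break  # incomplete block at end of file; skipped here
        blocks_read += 1  # same read I/O as a full backup
        if page_lsn(page) > threshold_lsn:
            selected.append((offset // BLOCK_SIZE, page))  # only these are sent
    return blocks_read, selected
```

blocks_read always equals the number of blocks in the file, which is the
"same I/O on the server side" part; len(selected) is what would actually go
over the network and land in client storage.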

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
