From: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
---|---|
To: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
Cc: | PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Extending BASE_BACKUP in replication protocol: incremental backup and backup format |
Date: | 2014-01-14 13:01:29 |
Message-ID: | 52D53529.8010809@vmware.com |
Lists: | pgsql-hackers |
On 01/14/2014 02:47 PM, Michael Paquier wrote:
> I would like to propose the following things to extend BASE_BACKUP to
> retrieve a backup from a stream:
> - Addition of an option FORMAT, to control the output format of
> backup, with possible options as 'plain' and 'tar'. Default is tar for
> backward compatibility purposes. The purpose of this option is to make
> easier for backup tools playing with postgres to retrieve and backup
> and analyze it on the fly, the purpose being to filter and analyze the
> data while it is being received without all the tar decoding
> necessary, what would consist in copying portions of pg_basebackup
> code more or less.
Umm, you have to somehow mark in the protocol where one file begins and
another one ends. The 'tar' format seems perfectly OK for that purpose.
What exactly would the 'plain' format do?
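
To illustrate the point about file boundaries: each tar member header carries the file name and size, so a client can pick files out of the stream and filter them as they arrive, without unpacking to disk first. Below is a minimal Python sketch of that idea, not pg_basebackup's actual code. It assumes the client already has the tar data as a file-like object. The names iter_stream_members and build_sample_tar are hypothetical helpers invented for this example.

```python
import io
import tarfile


def iter_stream_members(stream):
    """Yield (name, size, data) for each regular file in a tar stream.

    Mode "r|" reads the archive strictly sequentially, which is what a
    client consuming a backup stream over a socket would need.
    """
    with tarfile.open(fileobj=stream, mode="r|") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, symlinks, etc.
            data = tar.extractfile(member).read()
            yield member.name, member.size, data


def build_sample_tar(files):
    """Build an in-memory tar archive from {name: bytes} (demo input only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    sample = build_sample_tar({"PG_VERSION": b"9.4\n", "base/1/1234": b"x" * 16})
    for name, size, _data in iter_stream_members(sample):
        print(name, size)
```

The member header is what tells the reader where one file ends and the next begins, so any replacement format would need an equivalent marker.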
> - Addition of an option called INCREMENTAL to send an incremental
> backup to the client. This option uses as input an LSN, and sends back
> to client relation pages (in the shape of reduced relation files) that
> are newer than the LSN specified by looking at pd_lsn of
> PageHeaderData. In this case the LSN needs to be determined by client
> based on the latest full backup taken. This option is particularly
> interesting to reduce the amount of data taken between two backups,
> even if it increases the restore time as client needs to reconstitute
> a base backup depending on the recovery target and the pages modified.
> Client would be in charge of rebuilding pages from incremental backup
> by scanning all the blocks that need to be updated based on the full
> backup as the LSN from which incremental backup is taken is known. But
> this is not really something the server cares about... Such things are
> actually done by pg_rman as well.
How does the server find all the pages with LSN > the threshold? If it
needs to scan the whole database, it's not all that useful. I guess it
would be better than nothing, but I think you might as well just use rsync.
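
To make the cost concrete: pd_lsn is the first field of each page header, so finding the pages newer than a threshold means reading the header of every page in every relation file. The Python sketch below shows that scan over the bytes of one relation segment. It is an illustration under stated assumptions, not server code: it assumes the default 8 kB block size, a little-endian platform, and that pd_lsn is stored as two 32-bit words (high half first). The function names are hypothetical.

```python
import struct

BLCKSZ = 8192  # assumption: default PostgreSQL block size


def parse_lsn(text):
    """Convert an LSN in 'X/Y' hexadecimal notation to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def page_lsn(page, byteorder="<"):
    """Read pd_lsn from the first 8 bytes of a page.

    Assumes two 32-bit words, high half first, in the platform's byte
    order (little-endian by default here).
    """
    xlogid, xrecoff = struct.unpack_from(byteorder + "II", page, 0)
    return (xlogid << 32) | xrecoff


def pages_newer_than(data, threshold_lsn):
    """Return block numbers whose pd_lsn is greater than threshold_lsn.

    Every page is visited, so the cost grows with the size of the file
    regardless of how few pages actually changed.
    """
    changed = []
    for blkno in range(len(data) // BLCKSZ):
        page = data[blkno * BLCKSZ:(blkno + 1) * BLCKSZ]
        if page_lsn(page) > threshold_lsn:
            changed.append(blkno)
    return changed
```

The loop in pages_newer_than has no way to skip unchanged pages, which is the full-scan concern raised above.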
- Heikki