Re: Backup Question

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Shaun Thomas <sthomas(at)optionshouse(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Backup Question
Date: 2013-10-22 19:27:23
Message-ID: CAMkU=1wLtXA39u17npAhFtMQjYfWMMwVEDos1e0g4aKNHXudDw@mail.gmail.com
Lists: pgsql-general

On Tue, Oct 22, 2013 at 6:47 AM, Shaun Thomas <sthomas(at)optionshouse(dot)com> wrote:

> Hey everyone,
>
> This should be pretty straight-forward, but figured I'd pass it by anyway.
>
> I have a revised backup process that's coming out inconsistent, and I'm
> not entirely sure why. I call pg_start_backup(), tar.gz the contents
> elsewhere, then pg_stop_backup(). Nothing crazy. Upon restore, two of my
> tables report duplicate IDs upon executing my redaction scripts. The
> "duplicate" records ended up having different ctids, suggesting the log
> replay was incomplete. However, nothing in the restore logs suggests this is
> the case, and either way, the database wouldn't have come up if it were.
> (right?)
>
> Now, the main difference is that I'm doing the backup process on our
> streaming replication node. The backup process calls the pg_start_backup()
> function on the upstream provider, backs up the local content, then calls
> pg_stop_backup() on the upstream provider. In both cases, it captures the
> start/stop transaction log positions to grab all involved archived WAL
> files. I already know the start xlog position is insufficient, because
> those transaction logs may not have replayed on the standby yet, so I also
> grab 3xcheckpoint_timeout extra older files (before backup start), just in
> case.
>

That won't work. The backup_label file is promising that WAL before a
certain point has already been applied. Recovery rolls forward from this
point, and has no ability to jump backwards and roll forward from there (or
to detect that that would be needed). So you can grab the extra files, but
you can't make it apply them, as you are telling it that it doesn't need to.
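
To make that concrete, here is a rough Python sketch of which archived
segments recovery would actually look at. It is only an illustration, not
what the server literally does: it assumes everything is on a single
timeline (so segment file names can be compared as plain strings), and the
function names and the sample values are made up for the example.

```python
import re


def start_segment(backup_label_text):
    """Return the WAL segment file name named on the START WAL LOCATION line."""
    match = re.search(
        r"^START WAL LOCATION: \S+ \(file ([0-9A-F]{24})\)",
        backup_label_text,
        re.MULTILINE,
    )
    if match is None:
        raise ValueError("no START WAL LOCATION line found in backup_label")
    return match.group(1)


def segments_recovery_reads(backup_label_text, archived_segments):
    """Segments at or after the backup_label start point (hypothetical helper).

    Assumes one timeline, so comparing the 24-character names as strings
    matches WAL order. Anything older is simply never requested.
    """
    start = start_segment(backup_label_text)
    return sorted(name for name in archived_segments if name >= start)


# Illustrative backup_label contents; the values are invented.
label = (
    "START WAL LOCATION: 0/9000028 (file 000000010000000000000009)\n"
    "CHECKPOINT LOCATION: 0/9000060\n"
)
archive = [
    "000000010000000000000007",  # extra "older" file, grabbed just in case
    "000000010000000000000008",  # extra "older" file, grabbed just in case
    "000000010000000000000009",
    "00000001000000000000000A",
]
used = segments_recovery_reads(label, archive)
```

With these sample values, the two older segments are never in the list, no
matter how many of them you copy into the restore.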

You have to wait until the start-of-backup checkpoint has replayed (as a
restartpoint) on the slave before you start copying the slave's contents.
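
One way to check for that, sketched in Python under some assumptions: take
the location returned by pg_start_backup() on the master, run pg_controldata
against the slave's data directory, and compare it with the "Latest
checkpoint's REDO location" line. The helper names are hypothetical, the
sample output is abbreviated and invented, and I am assuming that the REDO
location in the slave's control file only advances when a restartpoint
completes, so treat this as a starting point rather than a tested procedure.

```python
import re


def parse_lsn(lsn):
    """Convert an 'X/Y' hexadecimal WAL location into a comparable integer."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def redo_location(controldata_text):
    """Pull the REDO location of the latest checkpoint out of pg_controldata output."""
    match = re.search(
        r"^Latest checkpoint's REDO location:\s*([0-9A-Fa-f]+/[0-9A-Fa-f]+)",
        controldata_text,
        re.MULTILINE,
    )
    if match is None:
        raise ValueError("REDO location line not found in pg_controldata output")
    return match.group(1)


def restartpoint_reached(controldata_text, backup_start_lsn):
    """True once the slave's latest restartpoint is at or past the backup start."""
    return parse_lsn(redo_location(controldata_text)) >= parse_lsn(backup_start_lsn)


# Abbreviated, invented pg_controldata output from the slave.
too_early = (
    "Latest checkpoint location:           0/8FFFF80\n"
    "Latest checkpoint's REDO location:    0/8FFFF48\n"
)
caught_up = (
    "Latest checkpoint location:           0/9000060\n"
    "Latest checkpoint's REDO location:    0/9000028\n"
)
backup_start = "0/9000028"  # value returned by pg_start_backup() on the master

ready_before = restartpoint_reached(too_early, backup_start)
ready_after = restartpoint_reached(caught_up, backup_start)
```

In a real script you would poll pg_controldata in a loop and only start the
tar once the check passes.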

See https://wiki.postgresql.org/wiki/Incrementally_Updated_Backups

Cheers,

Jeff
