Re: Backup Question

From: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Shaun Thomas *EXTERN*" <sthomas(at)optionshouse(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Backup Question
Date: 2013-10-22 15:11:22
Message-ID: A737B7A37273E048B164557ADEF4A58B17C5066C@ntex2010i.host.magwien.gv.at
Lists: pgsql-general

Shaun Thomas wrote:

> I have a revised backup process that's coming out inconsistent, and I'm not entirely sure why. I call
> pg_start_backup(), tar.gz the contents elsewhere, then pg_stop_backup(). Nothing crazy. Upon restore,
> two of my tables report duplicate IDs upon executing my redaction scripts. The "duplicate" records
> ended up having different ctid's, suggesting the log replay was incomplete. However, nothing in the
> restore logs suggest this is the case, and either way, the database wouldn't have come up if it were.
> (right?)

Wrong. The database cannot check all data for consistency
when it starts up from a backup. For one, that would take far too long.

> Now, the main difference, is that I'm doing the backup process on our streaming replication node. The
> backup process calls the pg_start_backup() function on the upstream provider, backs up the local
> content, then calls pg_stop_backup() on the upstream provider. In both cases, it captures the
> start/stop transaction log positions to grab all involved archived WAL files. I already know the start
> xlog position is insufficient, because those transaction logs may not have replayed on the standby
> yet, so I also grab 3xcheckpoint_timeout extra older files (before backup start), just in case.
>
> So, I get no complaints of missing or damaged archive log files. Yet the restore is invalid. I checked
> the upstream, and those duplicate rows are not present; it's clearly the backup that's at fault. I
> remember having this problem a couple years ago, but I "fixed" it by working filesystem snapshots into
> the backup script. I can do that again, but it seems like overkill, honestly.

If you back up the standby, then you won't have a backup_label file.
You cannot restore a backup without that.
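
As a rough illustration of that point, a restore script could refuse to
proceed when the unpacked data directory has no backup_label. This is only
a sketch: the function name has_backup_label and the idea of checking
before starting the server are my own assumptions, not part of any
PostgreSQL tool.

```python
import os


def has_backup_label(backup_dir):
    """Return True if the unpacked data directory contains a backup_label file.

    Hypothetical helper: backup_dir is the top-level data directory that was
    extracted from the tar archive.
    """
    return os.path.isfile(os.path.join(backup_dir, "backup_label"))
```

A script could call has_backup_label() on the extracted directory and abort
with an error if it returns False, rather than starting recovery.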

Moreover, recovery needs a checkpoint/restartpoint to start.
Restartpoints on the standby won't be the same as checkpoints
on the primary, so I believe that even with the backup_label
file you would not be able to restore the data.

I'm not sure about the second point; maybe somebody can confirm or refute it.
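
For anyone who wants to see which checkpoint a given backup would start
recovery from, the backup_label file records it in a "CHECKPOINT LOCATION"
line. The following is a minimal sketch that reads that line; the function
name and the sample contents are illustrative assumptions, and the sample
values are made up.

```python
import re


def checkpoint_location(label_text):
    """Extract the value of the CHECKPOINT LOCATION line, or None if absent."""
    match = re.search(r"^CHECKPOINT LOCATION:\s*(\S+)", label_text, re.MULTILINE)
    return match.group(1) if match else None


# Made-up sample in the layout written by pg_start_backup().
sample = """START WAL LOCATION: 0/9000028 (file 000000010000000000000009)
CHECKPOINT LOCATION: 0/9000060
"""
```

This only shows where recovery is supposed to begin; it does not tell you
whether the files copied from the standby are consistent with that point.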

Yours,
Laurenz Albe
