incremental backups

From: Rick Gigger <rick(at)alpinenetworking(dot)com>
To: pgsql general <pgsql-general(at)postgresql(dot)org>
Subject: incremental backups
Date: 2006-01-26 17:33:23
Message-ID: 9282E52E-4127-4B2F-925E-11331B5429D7@alpinenetworking.com
Lists: pgsql-general

I am looking into using WAL archiving for incremental backups. It
all seems fairly straightforward except for one thing.

So you set up the archiving of the WAL files. Then you set up cron
or something to regularly do a physical backup of the data
directory. But when you do the physical backup you don't have the
last WAL file archived yet that you need to restore that physical
backup. So you always need to keep at least two physical backups
around so that you know that at least one of them has the WAL files
needed for recovery.

The question I have is: how do I know if I can use the latest one?
That is, if I first do physical backup A and then later do physical
backup B and then I want to do a restore, how do I know when I've
got the files I need to use B so that I don't have to go all the way
back to A?

My initial thoughts are that I could:

a) just before or after calling pg_stop_backup, check the file system
to see what the last archived WAL file on disk is, and make sure
that I get the next one before I try restoring from that backup.

b) just before or after calling pg_stop_backup, check postgres to see
what the current active WAL file is, and make sure it has been
archived before I try to restore from that backup.

c) Always just use backup A.
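
Options a and b both come down to the same test: has the WAL segment
that was active when the backup finished reached the archive yet? A
minimal sketch of that test follows. It assumes the archive is a
plain directory of segment files with the usual 24-hex-character
names, and that the caller already knows the name of the segment in
question; how that name is obtained from postgres is not shown. The
function names and the archive path are hypothetical.

```python
import os
import re

# Assumption: archived WAL segments keep their original 24-character
# uppercase hexadecimal file names.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def archived_segments(archive_dir):
    """Return the sorted WAL segment names found in archive_dir."""
    return sorted(n for n in os.listdir(archive_dir) if WAL_NAME.match(n))


def backup_is_usable(archive_dir, last_needed_segment):
    """True once the segment active at pg_stop_backup time is archived."""
    return last_needed_segment in archived_segments(archive_dir)
```

A cron job could call backup_is_usable("/path/to/archive", name) for
backup B and only delete backup A once it returns True.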

Now c seems the easiest, but is that even fail-safe? I realize it
wouldn't really ever happen in an active production environment that
was set up right, but say you did backup A and backup B and during
that whole time you had so few writes in postgres that you never
filled up a whole WAL file, so both of the backups are invalid. Then
you would always have to go to option a or b above to verify that a
given backup was good so that any previous backups could be deleted.

Wouldn't it make things a lot easier if the backup history file not
only gave you the name of the first file that you need but also the
last one? Then you could look at a given backup and say I need this
start file and this end file. Then you could delete all archived WAL
files before start file. And you could delete any old physical dumps
because you know that your last physical dump was good. It would
just save you the step in the backup process of figuring out what
that file is. And it seems like pg_stop_backup could determine that
on its own.
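
To make the idea concrete, here is a rough sketch of what a script
could do with such a file. It assumes a history file with lines of
the form "START WAL LOCATION: 0/9000020 (file 0000...)" and, if
present, a matching "STOP WAL LOCATION: ..." line; that exact layout
is an assumption and should be checked against a real history file
from your version. The function names are hypothetical.

```python
import re

# Assumed line format: "<START|STOP> WAL LOCATION: <pos> (file <segment>)"
LINE = re.compile(r"^(START|STOP) WAL LOCATION: \S+ \(file ([0-9A-F]{24})\)$")


def wal_range(history_text):
    """Return (start_segment, stop_segment); either is None if not listed."""
    found = {"START": None, "STOP": None}
    for line in history_text.splitlines():
        m = LINE.match(line.strip())
        if m:
            found[m.group(1)] = m.group(2)
    return found["START"], found["STOP"]


def removable_segments(archived, start_segment):
    """Archived segments older than start_segment on the same timeline.

    Segment names are fixed-width uppercase hex, so plain string
    comparison orders them correctly within one timeline.
    """
    timeline = start_segment[:8]
    return [n for n in archived if n[:8] == timeline and n < start_segment]
```

With both names in hand, a backup is known to be good once the stop
segment is in the archive, and everything returned by
removable_segments is no longer needed for that backup.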

Does that make sense? Am I totally off base here?

Rick
