Re: Basic question on recovery and disk snapshotting

From: Yang Zhang <yanghatespam(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Jov <amutu(at)amutu(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Basic question on recovery and disk snapshotting
Date: 2013-04-27 20:10:24
Message-ID: CAKxBDU92PF3MD8CTJE=GwXmpMEbBVpSbML48iTHz_6s7kF5_XA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sat, Apr 27, 2013 at 11:55 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> On Sat, Apr 27, 2013 at 10:40 AM, Yang Zhang <yanghatespam(at)gmail(dot)com> wrote:
>> My question really boils down to: if we're interested in using COW
>> snapshotting (a common feature of modern filesystems and hosting
>> environments), would we necessarily need to ensure the data and
>> pg_xlog are on the same snapshotted volume?
>
>
> That would certainly make it easier. But it shouldn't be necessary, as long
> as the xlog snapshot is taken after the cluster snapshot, and also as long
> as no xlog files which were written to after the last completed checkpoint
> prior to the cluster snapshot got recycled before the xlog snapshot. As
> long as the snapshots run quickly and promptly one after the other, this
> should not be a problem, but you should certainly validate that a snapshot
> collection has all the xlogs it needs before accepting it as being good. If
> you find some necessary xlog files are missing, you can turn up
> wal_keep_segments and try again.

This information is gold, thank you.

How do I validate that a snapshot collection has all the xlogs it needs?

>
>
>>
>> If not, how should we be
>> taking the snapshots - should we be using pg_start_backup() and then
>> taking the snapshot of one before the other? (What order?) What if
>> we have tablespaces, do we take snapshots of those, followed by the
>> cluster directory, followed by pg_xlog?
>
>
> First the cluster directory (where "pg_control" is), then tablespaces, then
> pg_xlog. pg_start_backup() shouldn't be necessary, unless you are running
> with full_page_writes off. But it won't hurt, and if you don't use
> pg_start_backup you should probably run a checkpoint of your own immediately
> before starting.
>
>>
>> I read through
>> http://www.postgresql.org/docs/9.1/static/continuous-archiving.html
>> and it doesn't touch on these questions.
>
>
> Your goal seems to be to *avoid* continuous archiving, so I wouldn't expect
> that part of the docs to touch on your issues. But see the section
> "Standalone Hot Backups" which would allow you to use snapshots for the
> cluster "copy" part, and normal archiving for just the xlogs. The volume of
> pg_xlog should be fairly small, so this seems to me like an attractive
> option.

Just to validate my understanding, are the two options as follows?

a. Checkpoint (optional but helps with time window?), snapshot
tablespaces/cluster/xlog, validate all necessary xlogs present.

b. Set wal_level/archive_mode/archive_command, pg_start_backup,
snapshot tablespaces/cluster, pg_stop_backup to archive xlog.

(a) sounds more appealing since it's treating recovery as crash
recovery rather than backup restore, and as such seems simpler and
lower-overhead (e.g. WAL verbosity, though I don't know how much that
overhead is). However, I'm not sure how complex that validation step
is.

>
> If you really don't want to use archiving, even just during the duration of
> the cluster snapshotting, then this is the part that addresses your
> questions:
>
> http://www.postgresql.org/docs/9.1/static/backup-file.html

I'm still interested in online backups, though - stopping the DB is a
no-go unfortunately.

--
Yang Zhang
http://yz.mit.edu/

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2013-04-28 03:03:26 Re: DISTINCT ON changes sort order on its own it seems
Previous Message Jeff Janes 2013-04-27 18:55:05 Re: Basic question on recovery and disk snapshotting