From: | Jerry Sievers <gsievers19(at)comcast(dot)net> |
---|---|
To: | Stephen Frost <sfrost(at)snowman(dot)net> |
Cc: | pggeneral <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Feedback about hybrid SAN snap and rsync'd approach for large systemcloning |
Date: | 2022-04-19 02:35:00 |
Message-ID: | A1B67D24-5721-4597-8F54-9217C77028A0@comcast.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi Stephen, and thanks! Please see below...
> Greetings,
> * Jerry Sievers (gsievers19(at)comcast(dot)net) wrote:
> Suppose we have a DB cluster with an additional tablespace and we are
> able to make an atomic SAN snapshot of *only* the main cluster
> volume...
> The additional tablespace contains only UNLOGGED relations.
> We cannot snap that volume so we use rsync as follows...
> 1. pg_start_backup('foo');
> make SAN snapshot
> rsync the add'l tablespace
> pg_stop_backup()
> Surely you're also doing WAL archiving? You shouldn't ever be using
> start/stop backup without also doing WAL archiving as you must capture
> all of the WAL between the two.
Oh, we're doing archiving all right, via pg_receivewal.
And please stop calling me Shirley :-)
> Now provision a new cluster around the snapshot and rsync'd volume,
> rejigger the pg_tblspc link if necessary... and start it up maybe or
> not having it remain as a streaming replica.
> Dedicated backup tools know how to do tablespace remapping.
Yep, I'm aware. The fixing of symlinks isn't much of a concern here,
generally.
> It's been my experience that possibly bulky data in the additional
> tablespace does *not* need be rsync'd if we capture only the *_init
> files.
> That isn't a trivial thing to do with rsync tho and "try it and see if
> it works" is a really bad way to go about this- it's important to
> understand what the files are in the data directory, how PG uses them,
> how PG recovery works, etc. That's why there are folks who write
> dedicated backup tools for PG (myself included).
Understood. It's a legacy configuration that we're working with
that's used both for making full backups as well as frequently
refreshed non-prod systems. Some of them are large enough to make
compelling avoidance of any unnecessary data materialization.
> Id' be curious to here feedback re the sanity of this approach.
> I wouldn't recommend it is the short answer. Note that you must be
> doing WAL archiving and you really should (and will have to, as of 15)
Hmmm, we're using non-exclusive just as of my most recent
updates... but I wasn't aware that exclusive mode was going away
entirely w/V15.
> use non-exclusive backup when you're doing this and be sure to copy the
> backup_label file from the pg_stop_backup/pg_backup_stop (in v15)
> results and put that into the snapshot. If you use exclusive mode
> then if the system crashes for any reason during the rsync then you'll
> have a good chance of ending up with a system that won't come back up
> until you go muck around in the data directory. It's not good and is
> why it's now been removed.
Agree and I've certainly had a few cases of that in the past. Agree
that it'll foil the less seasoned DBA.
Yep and we do stash the labelfile output from pg_stop_backup in the
snap after it's mounted on whatever box.
> Also, you'd have to craft a pretty ugly rsync to make it not copy data
> from the unlogged tables, and when you end up with a logged table in
> that tablespace (and you will, unless you're actively monitoring for it
> and remove any that pop up...) you could end up with data corruption.
We are... but indeed the rsync include/exclude rules were fussy to
develop and IMO non-trivial to really verify, certainly not ideal.
For that matter, the event trigger that disallows creating logged
tables I'm not certain is foolproof.
> Don't think that you'll be able to use delta rsyncs with this either, as
> you may end up with files being changed without rsync realizing it and
> again end up with corruption.
We don't do that anyhow.
The snaps I'm talking about here are instantiated only once, then
written to tape or brought up as non-prod systems... then torn all the
way down.
> And would also like to know if perhaps *only* the directories under
> the rsync'd tablespace actually must be present for a successful
> recovery.
Forget I ever said the above which I'm pretty sure isn't workable and
wouldn't buy much anyhow.
> The above approach has worked numerous times even with origin
> systems having large, churny contents in the dedicated unlogged
> tablespace (which is on a much faster local NVME volume than the
> main SAN volume.)
> Dedicated backup tools already know how to recognize unlogged tables and
> skip them, along with being able to do incremental backup and delta
> restores, I'd strongly suggest you consider using such tools instead of
> trying to hack your own.
Point very well taken and it's about time my site had another look to
see if such a solution is a good fit and /or to reverify that our
backups are consistently recoverable.
Thx again for weighing in.
> Thanks,
> Stephen
From | Date | Subject | |
---|---|---|---|
Next Message | Ram Pratap Maurya | 2022-04-19 03:29:49 | RE: Huge archive log generate in Postgresql-13 |
Previous Message | Tom Lane | 2022-04-18 21:08:14 | Re: No psql md5 auth, psql 14.1 to PG 11 |