Re: Netapp SnapCenter

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Paul Förster <paul(dot)foerster(at)gmail(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, "Wolff, Ken L" <ken(dot)l(dot)wolff(at)lmco(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Netapp SnapCenter
Date: 2020-06-18 19:26:42
Message-ID: 20200618192642.GH6680@tamriel.snowman.net
Lists: pgsql-general

Greetings,

* Paul Förster (paul(dot)foerster(at)gmail(dot)com) wrote:
> > On 18. Jun, 2020, at 16:19, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> > I don't know specifically about SnapCenter, but for snapshots in general, it does require backup mode *unless* all your data is on the same disk and you have an atomic snapshot across that disk (in theory it can be on different disks as well, as long as the snapshots in that case are atomic across *all* those disks, not just individually, but that is unusual).
>
> according to what I know from our storage guys, Netapp does atomic snapshots for each volume. We have the database and its corresponding WAL files (pg_wal directory) on the same volume and the archived WALs (archive_command) on another. And the snapshots on those two volumes are not taken at the same time. Currently, the database is set to backup mode (using cron) and the storage guys have a window during which they can take the snapshots.

If the entire database, including all tablespaces and pg_wal, is on the
same volume and the snapshot of that volume is atomic, then you don't
actually need to go through start/stop backup. Restoring such a snapshot
looks just like recovering from a system crash: PG goes back to the last
checkpoint and replays the WAL that's in pg_wal, and it should reach
consistency and come up.
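
As a rough way to check the "everything on one volume" precondition, the
sketch below compares the device IDs of the data directory, pg_wal, and
every tablespace link under pg_tblspc. It assumes that a shared st_dev
means a shared mounted filesystem, and that one filesystem corresponds to
one storage volume; that mapping depends on your setup, so confirm it with
the storage side. The function name is made up for illustration.

```python
import os


def on_single_filesystem(pgdata):
    """Return True if PGDATA, pg_wal and all tablespaces share one device ID."""
    paths = [pgdata, os.path.join(pgdata, "pg_wal")]

    # pg_tblspc holds one symlink per user-created tablespace.
    tblspc = os.path.join(pgdata, "pg_tblspc")
    if os.path.isdir(tblspc):
        paths += [os.path.join(tblspc, name) for name in os.listdir(tblspc)]

    # os.stat() follows symlinks, so a relocated pg_wal or tablespace
    # reports the device it actually lives on.
    devices = {os.stat(path).st_dev for path in paths}
    return len(devices) == 1
```

If this returns False, a snapshot of a single volume cannot capture a
crash-consistent image on its own, and backup mode is needed.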

> > So the upthread suggestion of putting data and wal on different disks and snapshotting them at different times is *NOT* safe. Unless the reference to the directory for the logs means a directory where log files are copied out with archive_command, and it's actually the log archive (in which case it will work, but the recommendation is that the log archive should not be on the same machine).
>
> as I said above, pg_wal is a directory in PGDATA at the default location and WALs are archived using the archive_command to a different volume. So I guess, we should be safe then.

Yes, that should be alright.

> > The normal case is that snapshots are guaranteed to be atomic, and thus this millisecond window cannot appear. But that usually only applies across individual volumes. It's also worth noticing that taking the backups without using the backup mode and archive_command means you cannot use them for PITR, only for restore onto that specific snapshot.
> >
> > While our documentation on backups in general definitely needs improvement, this particular requirement is documented at https://www.postgresql.org/docs/current/backup-file.html.
>
> since we use backup mode, we should be good with PITR too. I didn't test that myself, but a workmate did and he said it worked nicely.
>
> So bottom line, SnapCenter means for PostgreSQL just a plain volume snapshot like we currently do with SnapCreator, using backup mode scripts and "archive_command"-ing WALs to a different volume, i.e. no change in strategy. It's just that SnapCenter can't control backup mode as it can with Oracle.

The one issue here is that if you're using the deprecated exclusive
backup API, then PG will create a backup_label file in the data
directory. If the system reboots while that file exists, there's a good
chance that PG won't start up cleanly since, due to the file existing,
it thinks that it's restoring from a backup when it isn't.

Of course, if you're actually restoring from a backup, then that file is
absolutely critical to have in place, otherwise PG won't realize it's
being restored from a backup and you'll end up with a corrupted database
on restore.

Better is to use the newer non-exclusive API and arrange to collect the
necessary contents of the backup_label file from PG and store that with
the snapshot that you've taken.
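
A minimal sketch of that approach, assuming PG 9.6 through 14 (where the
functions are still named pg_start_backup and pg_stop_backup) and a DB-API
connection such as one from psycopg2. The take_snapshot callable, the
function name, and the output directory are placeholders for whatever
triggers and catalogs the storage snapshot at your site.

```python
from pathlib import Path


def snapshot_with_backup_label(conn, take_snapshot, label, out_dir):
    """Wrap a storage snapshot in a non-exclusive backup and save its label.

    conn must stay open for the whole backup: if the session that called
    pg_start_backup disconnects, PG aborts the non-exclusive backup.
    """
    cur = conn.cursor()

    # Third argument false selects the non-exclusive API, so no
    # backup_label file is written into the live data directory.
    cur.execute("SELECT pg_start_backup(%s, false, false)", (label,))
    cur.fetchone()

    # Hypothetical hook: trigger the volume snapshot here. If it raises,
    # pg_stop_backup is never called and closing conn aborts the backup.
    take_snapshot()

    # pg_stop_backup(false) returns the backup_label and tablespace_map
    # contents as text instead of writing them to disk.
    cur.execute("SELECT labelfile, spcmapfile FROM pg_stop_backup(false)")
    labelfile, spcmapfile = cur.fetchone()

    out = Path(out_dir)
    label_path = out / "backup_label"
    label_path.write_text(labelfile)
    if spcmapfile:
        (out / "tablespace_map").write_text(spcmapfile)
    return label_path
```

On restore, the saved backup_label (and tablespace_map, if one was written)
has to be placed in the root of the restored data directory before PG is
started.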

Thanks,

Stephen
