From: | Magnus Hagander <magnus(at)hagander(dot)net> |
---|---|
To: | HECTOR INGERTO <HECTOR_25E(at)hotmail(dot)com> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Are ZFS snapshots unsafe when PGSQL is spreading through multiple zpools? |
Date: | 2023-01-15 19:36:44 |
Message-ID: | CABUevEwK7mzh6VKmiwQh1JVdhz4+tSxobmjQEnK5=KvTtJY1Xg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Sun, Jan 15, 2023 at 8:18 PM HECTOR INGERTO <HECTOR_25E(at)hotmail(dot)com>
wrote:
> Hello everybody,
>
>
>
> I’m using PostgreSQL on openZFS. I use ZFS snapshots as a backup +
> hotspare method.
>
>
>
> From man zfs-snapshot: “Snapshots are taken atomically, so that all
> snapshots correspond to the same moment in time.” So if a PSQL instance is
> started from a zfs snapshot, it will start to replay the WAL from the last
> checkpoint, in the same way it would do in a crash or power loss scenario. So
> from my knowledge, ZFS snapshots can be used to rollback to a previous
> point in time. Also, sending those snapshots to other computers will allow
> you to have hotspares and remote backups. If I’m wrong here, I would
> appreciate being told about it because I’m basing the whole question on
> this premise.
>
>
>
> On the other hand, we have the tablespace PGSQL feature, which is great
> because it allows “unimportant” big data to be written into cheap HDD and
> frequently used data into fast NVMe.
>
>
>
> So far, so good. The problem is when both ideas are merged. Then,
> snapshots from different pools are NOT atomical, snapshot on the HDD pool
> isn’t going to be done at the same exact time as the one on the SSD pool,
> and I don’t know enough about PGSQL internals to know how dangerous this
> is. So here is where I would like to ask for your help with the following
> questions:
>
>
>
> First of all, what kind of problem can this lead to? Are we talking about
> potential whole DB corruption or only the loss of a few of the latest
> transactions?
>
Silent data corruption. *not* just losing your latest transaction.
> In second place, if I’m initializing a corrupted PGSQL instance because
> ZFS snapshots are from different pools and slightly different times, am I
> going to notice it somehow or is it going to fail silently?
>
Silent. You might notice at the application level. Might.
> In third and last place, is there some way to quantify the amount of risk
> taken when snapshotting a PGSQL instance spread across two (or more)
> different pools?
>
>
>
"Don't do it".
If you can't get atomic snapshots, don't do it, period.
You can use them together with a regular online backup. That is
pg_start_backup() // <snapshot multiple volumes> // pg_stop_backup()
together with log archiving. That's a perfectly valid method. But you
cannot and should not rely on snapshots alone.
--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/>
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2023-01-15 19:47:35 | Re: pg_upgrade 13.6 to 15.1? |
Previous Message | pf | 2023-01-15 19:27:32 | pg_upgrade 13.6 to 15.1? |