Re: Are ZFS snapshots unsafe when PGSQL is spreading through multiple zpools?

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: HECTOR INGERTO <hector_25e(at)hotmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>
Cc: "pgsql-general(at)postgresql(dot)org <pgsql-general(at)postgresql(dot)org>" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Are ZFS snapshots unsafe when PGSQL is spreading through multiple zpools?
Date: 2023-01-16 13:21:32
Message-ID: e07983a4d1cd0b5df43f2e5fcdfdb6f43ea0eb75.camel@cybertec.at
Lists: pgsql-general

On Mon, 2023-01-16 at 08:41 +0000, HECTOR INGERTO wrote:
> I have understood I shall not do it, but could the technical details be discussed about
> why silent DB corruption can occur with non-atomical snapshots?

The database relies on the data being consistent when it performs crash recovery.
Imagine that a checkpoint is running while you take your snapshot. The checkpoint
syncs a data file with a new row to disk. Then it writes a WAL record and updates
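the control file. Now imagine that the table with the new row is on a different
file system, and that the snapshot of the WAL file system captures the WAL record
and the updated control file, while the snapshot of the table's file system was
taken a moment earlier, when the new row was still sitting in the kernel page cache.
Crash recovery on the restored snapshots starts at that checkpoint and never replays
the WAL for the new row, so you end up with a lost row.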
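
To make the sequence concrete, here is a rough toy model of that scenario in Python.
It is only a sketch, not how PostgreSQL or ZFS are implemented: the ToyFS class, the
"table", "wal" and "control" keys, the LSN numbers and the simulate() and recover()
helpers are all invented for illustration, and it assumes, as in the description
above, that a snapshot sees only what has already been flushed from the page cache.

```python
import copy


class ToyFS:
    """Hypothetical file system: writes land in a cache until sync() is called."""

    def __init__(self):
        self.cache = {}  # written but not yet flushed
        self.disk = {}   # durable contents, the only thing a snapshot sees

    def write(self, name, contents):
        self.cache[name] = contents

    def sync(self):
        self.disk.update(self.cache)
        self.cache.clear()

    def snapshot(self):
        return copy.deepcopy(self.disk)


def recover(data_snap, wal_snap):
    """Toy crash recovery: replay WAL records at or after the redo point."""
    table = dict(data_snap["table"])
    redo_lsn = wal_snap["control"]["redo_lsn"]
    for lsn, key, value in wal_snap["wal"]:
        if lsn >= redo_lsn:
            table[key] = value
    return table


def simulate(skewed):
    """Return the table contents after restoring and recovering the snapshots.

    skewed=False: both snapshots are taken at the same instant.
    skewed=True:  the data snapshot is taken before the checkpoint,
                  the WAL/control snapshot after it.
    """
    data_fs, wal_fs = ToyFS(), ToyFS()
    data_fs.write("table", {})
    data_fs.sync()
    wal_fs.write("wal", [])
    wal_fs.write("control", {"redo_lsn": 0})
    wal_fs.sync()

    # A committed insert: the WAL record is flushed, the table page is not.
    wal_fs.write("wal", [(1, "row1", "hello")])
    wal_fs.sync()
    data_fs.write("table", {"row1": "hello"})

    # The data file system is snapshotted here in both cases.
    data_snap = data_fs.snapshot()
    if not skewed:
        wal_snap = wal_fs.snapshot()  # same instant as the data snapshot

    # Checkpoint: flush the table, then advance the redo point past LSN 1.
    data_fs.sync()
    wal_fs.write("control", {"redo_lsn": 2})
    wal_fs.sync()

    if skewed:
        wal_snap = wal_fs.snapshot()  # taken after the checkpoint

    return recover(data_snap, wal_snap)


if __name__ == "__main__":
    print("consistent snapshots:", simulate(skewed=False))
    print("skewed snapshots:    ", simulate(skewed=True))
```

With consistent snapshots the row is missing from the table file but is restored by
WAL replay. With skewed snapshots the control file claims the row was already
checkpointed, so replay skips it and the restored table is silently empty.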

That is only one scenario. Corruption can happen in many other ways.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
