RE: Are ZFS snapshots unsafe when PGSQL is spreading through multiple zpools?

From: HECTOR INGERTO <hector_25e(at)hotmail(dot)com>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Magnus Hagander <magnus(at)hagander(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: RE: Are ZFS snapshots unsafe when PGSQL is spreading through multiple zpools?
Date: 2023-01-16 14:37:23
Message-ID: GV1P189MB2036AFE96E22BF5830C1E22CF5C19@GV1P189MB2036.EURP189.PROD.OUTLOOK.COM
Lists: pgsql-general

> The database relies on the data being consistent when it performs crash recovery.
> Imagine that a checkpoint is running while you take your snapshot. The checkpoint
> syncs a data file with a new row to disk. Then it writes a WAL record and updates
> the control file. Now imagine that the table with the new row is on a different
> file system, and your snapshot captures the WAL and the control file, but not
> the new row (it was still sitting in the kernel page cache when the snapshot was taken).
> You end up with a lost row.
>
> That is only one scenario. Many other ways of corruption can happen.

Can we say, then, that the risk comes only from a checkpoint running during the time gap between the non-simultaneous snapshots?
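
To make the quoted scenario concrete, here is a minimal toy model of it in Python. It does not touch PostgreSQL or ZFS; the two dictionaries, the step ordering, and the names `simulate` and `row_lost` are assumptions made purely for illustration. It models only the single checkpoint scenario described in the quote, which the quote itself says is just one of several possible failure modes.

```python
import copy

def simulate(snapshot_data_at, snapshot_wal_at):
    """Toy model of one checkpoint spread over two pools.

    Each argument is the step index *before* which that pool is snapshotted
    (0 = before anything happens, 3 = after the checkpoint has finished).
    All names here are hypothetical; nothing is a real PostgreSQL or ZFS API.
    """
    data_pool = {"table_rows": []}                       # pool holding the table's data file
    wal_pool = {"wal": [], "control_checkpoint": None}   # pool holding WAL and control file

    def sync_data_file():
        data_pool["table_rows"].append("new row")

    def write_wal_record():
        wal_pool["wal"].append("checkpoint record")

    def update_control_file():
        wal_pool["control_checkpoint"] = "checkpoint record"

    # Order taken from the quoted description of the checkpoint.
    steps = [sync_data_file, write_wal_record, update_control_file]

    snaps = {}
    for i in range(len(steps) + 1):
        if i == snapshot_data_at:
            snaps["data"] = copy.deepcopy(data_pool)
        if i == snapshot_wal_at:
            snaps["wal"] = copy.deepcopy(wal_pool)
        if i < len(steps):
            steps[i]()
    return snaps

def row_lost(snaps):
    """True if the control file claims the checkpoint completed but the row is absent."""
    checkpoint_recorded = snaps["wal"]["control_checkpoint"] is not None
    row_present = "new row" in snaps["data"]["table_rows"]
    return checkpoint_recorded and not row_present

# Data pool snapshotted before the checkpoint, WAL pool after it: inconsistent pair.
print(row_lost(simulate(snapshot_data_at=0, snapshot_wal_at=3)))
# Both pools snapshotted at the same instant: consistent pair in this model.
print(row_lost(simulate(snapshot_data_at=1, snapshot_wal_at=1)))
```

In this sketch the inconsistency appears only when the two snapshot points straddle the checkpoint steps. That matches the quoted example, but the model is too small to show whether other interleavings outside a checkpoint are safe.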
