Re: Incremental backup from a streaming replication standby fails

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: David Steele <david(at)pgmasters(dot)net>, pgsql-hackers(at)postgresql(dot)org
Date: 2024-07-22 13:37:26
Message-ID: CA+TgmoYY=xWk_FFuRtpUe1VZv7jKkskwA8CCPFxeFt2dbLvEOg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 19, 2024 at 6:07 PM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote:
> Here is a patch.
> I went for both the errhint and some documentation.

Hmm, the hint doesn't end up using the word "standby" anywhere. That
seems like it might not be optimal?

+ Like a base backup, you can take an incremental backup from a streaming
+ replication standby server. But since a backup of a standby server cannot
+ initiate a checkpoint, it is possible that an incremental backup taken
+ right after a base backup will fail with an error, since it would have
+ to start with the same checkpoint as the base backup and would therefore
+ be empty.

Hmm. I feel like I'm about to be super-nitpicky, but this seems
imprecise to me in multiple ways. First, an incremental backup is a
kind of base backup, or at least, it's something you take with
pg_basebackup. Note that later in the paragraph, you use the term
"base backup" to refer to what I have been calling the "prior" or
"previous" backup or "the backup upon which it depends," but that
earlier backup could be either a full or an incremental backup.
Second, the standby need not be using streaming replication, even
though it probably will be in practice. Third, the failing incremental
backup doesn't necessarily have to be attempted immediately after the
previous one - the intervening time could be quite long on an idle
system. Fourth, it makes it sound like the backup being empty is a
reason for it to fail, which is debatable; I think we should try to
cast this more as an implementation restriction.

How about something like this:

An incremental backup is only possible if replay would begin from a
later checkpoint than for the previous backup upon which it depends.
On the primary, this condition is always satisfied, because each
backup triggers a new checkpoint. On a standby, replay begins from the
most recent restartpoint. As a result, an incremental backup may fail
on a standby if there has been very little activity since the previous
backup. Attempting to take an incremental backup from a standby that is
lagging behind the primary (or some other standby) using a prior backup
taken at a later WAL position may fail for the same reason.
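
To make the condition in that proposed wording concrete, here is a minimal
sketch of the comparison it describes. This is only an illustration, not the
server's actual code: the function names parse_lsn and
incremental_backup_possible are hypothetical, and the LSN values are made up.
It assumes only that WAL positions are written in the usual "hi/lo"
hexadecimal form.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN written as 'hi/lo' in hex (e.g. '0/3000028') to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def incremental_backup_possible(prior_start_lsn: str, new_start_lsn: str) -> bool:
    """Hypothetical check: replay for the new backup must begin from a later
    checkpoint (or restartpoint) than for the backup it depends on."""
    return parse_lsn(new_start_lsn) > parse_lsn(prior_start_lsn)


# Idle standby: no new restartpoint since the previous backup.
print(incremental_backup_possible("0/3000028", "0/3000028"))  # False

# A later restartpoint exists, so replay would begin further along.
print(incremental_backup_possible("0/3000028", "0/5000060"))  # True

# Lagging standby, prior backup taken elsewhere at a later WAL position.
print(incremental_backup_possible("0/5000060", "0/3000028"))  # False
```

The values are compared as integers rather than as strings because a plain
string comparison would order "0/A000000" after "0/10000000", which is the
wrong way around.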

I'm not saying that's perfect, but let me know your thoughts.

--
Robert Haas
EDB: http://www.enterprisedb.com
