Re: pg_rewind with cascade standby doesn't work well

From: Ilya Gladyshev <ilya(dot)v(dot)gladyshev(at)gmail(dot)com>
To: Kuwamura Masaki <kuwamura(at)db(dot)is(dot)i(dot)nagoya-u(dot)ac(dot)jp>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Subject: Re: pg_rewind with cascade standby doesn't work well
Date: 2024-07-22 21:47:37
Message-ID: 735FA63F-296A-4895-93B6-80F202EC5B1C@gmail.com
Lists: pgsql-hackers


> <v3-0001-pg_rewind-Fix-bug-using-cascade-standby-as-source.patch>

Hi,

Thank you for addressing this issue!

The patch needs a rebase, as it no longer applies on master, but here are some general thoughts on it (I haven’t tested it):

1. Regarding the approach of forcing a checkpoint on every restartpoint record, I wonder whether it has performance implications: WAL replay will now wait for the restartpoint to finish, whereas before it happened in the background.
2. This change of behaviour should be documented in [1]; there’s a paragraph about restartpoints there.
3. It looks like some pg_rewind code that accommodates the "restartpoint < last common checkpoint" situation could be cleaned up as well. I found this at pg_rewind.c:669 on efcbb76efe, but maybe there’s more:

if (ControlFile_source.checkPointCopy.redo < chkptredo) …

There’s also a less invasive option that I think is worth exploring: detect this situation from pg_rewind and simply issue a checkpoint on the standby.
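
To make that idea a bit more concrete, here is a rough sketch of the check pg_rewind could make, modelled on the comparison quoted above. The helper name source_needs_checkpoint and the local XLogRecPtr typedef are made up for illustration and are not taken from the patch or from pg_rewind.c; how pg_rewind would then request the checkpoint (presumably by sending CHECKPOINT over its existing connection to the source, which forces a restartpoint on a standby) is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's XLogRecPtr (a 64-bit WAL position); assumption for this sketch. */
typedef uint64_t XLogRecPtr;

/*
 * Hypothetical helper, not part of pg_rewind.
 *
 * source_redo: redo location from the source's control file
 *              (ControlFile_source.checkPointCopy.redo in the snippet above).
 * chkptredo:   redo location of the last common checkpoint.
 *
 * Returns true when the source standby's latest restartpoint is behind
 * the last common checkpoint, i.e. the case where pg_rewind would want
 * to ask the standby for a checkpoint before proceeding.
 */
static bool
source_needs_checkpoint(XLogRecPtr source_redo, XLogRecPtr chkptredo)
{
    return source_redo < chkptredo;
}
```

Whether this is sufficient depends on the standby actually being able to perform a restartpoint at that moment, which I have not verified.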

Regards,
Ilya

[1] https://www.postgresql.org/docs/devel/wal-configuration.html
