From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | Alexander Kukushkin <cyberdemn(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Oleksandr Shulgin <oleksandr(dot)shulgin(at)zalando(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Concurrency issue in pg_rewind |
Date: | 2020-09-29 06:49:31 |
Message-ID: | ac2f431b-40dc-ca2b-b8c1-deb3c621b3f6@iki.fi |
Lists: | pgsql-hackers |

On 18/09/2020 10:17, Alexander Kukushkin wrote:
> At the same time, pg_rewind due to such "fatal" error leaves PGDATA in
> an inconsistent state with empty pg_control file, this is totally bad
> and easily fixable. We want the specific file to be absent and it is
> already absent, why should it be a fatal error and not warning?

Whenever pg_rewind runs into something unexpected, it fails loudly, so
that the administrator can re-initialize from a base backup. That's the
general rule. If a file goes missing while pg_rewind is running, that is
unexpected. It could be a sign that the server was started concurrently,
or another pg_rewind was started against it, for example.

I feel that we could make an exception of some sort here, but I'm not
sure what exactly. I don't feel comfortable just downgrading the
unexpected ENOENT on unlink() to a warning in all cases. Besides, scary
warnings that you routinely ignore are not good either.
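To make the trade-off concrete, here is a minimal sketch of a narrower rule than "warn on every ENOENT": a missing file is tolerated only when it lives under pg_wal, and stays a hard failure everywhere else. It is written in Python for brevity (pg_rewind itself is written in C), and the function name `remove_target_file` and its return values are hypothetical, not part of pg_rewind.

```python
import os


def remove_target_file(datadir, relpath):
    """Remove datadir/relpath (hypothetical helper, not pg_rewind code).

    Assumption for illustration: a file that is already gone is
    tolerated only under pg_wal; anywhere else it remains an error.
    """
    full_path = os.path.join(datadir, relpath)
    try:
        os.unlink(full_path)
        return "removed"
    except FileNotFoundError:
        # ENOENT: the file vanished between planning and removal.
        if relpath.split("/", 1)[0] == "pg_wal":
            return "already-absent"
        # Outside pg_wal this is still unexpected, so fail loudly.
        raise
```

Whether such a scoped exception is acceptable is exactly the open question; the sketch only shows that the exception does not have to apply to the whole data directory.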

I have a hard time coming up with a general rule and justification
that's not just "do X because WAL-G does Y". pg_rewind failing because
WAL-G removed a file unexpectedly is one problem, but another is that
the restore_command might get confused if pg_rewind removes a file
that restore_command needs. This is hard to coordinate when restore_command does
things in the background, and there's no communication between the
background process and pg_rewind.

The general principle is that pg_rewind is equivalent to overwriting the
target with the source, only faster. Perhaps pg_wal should be an
exception, and pg_rewind should leave alone any files under pg_wal that
it doesn't recognize.
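As an illustration of what "leave alone any files under pg_wal that it doesn't recognize" could look like, here is a minimal sketch in Python (pg_rewind itself is written in C). The set of recognized names is an assumption made for this example: WAL segment files named with 24 uppercase hex digits (optionally with a `.partial` suffix), timeline history files, and the `archive_status` subdirectory. The function names and the `some_tool_dir` path are hypothetical.

```python
import re


def is_recognized_wal_name(name):
    """True for names this sketch treats as PostgreSQL's own WAL files."""
    # WAL segment: 24 uppercase hex digits, optionally ".partial".
    if re.fullmatch(r"[0-9A-F]{24}(\.partial)?", name):
        return True
    # Timeline history file: 8 uppercase hex digits plus ".history".
    if re.fullmatch(r"[0-9A-F]{8}\.history", name):
        return True
    return False


def should_leave_alone(relpath):
    """True if relpath is under pg_wal and not recognized (assumed policy)."""
    parts = relpath.split("/")
    if len(parts) < 2 or parts[0] != "pg_wal":
        return False  # outside pg_wal: the normal rules apply
    if len(parts) == 2:
        return not is_recognized_wal_name(parts[1])
    # Nested paths: only archive_status is treated as known here.
    return parts[1] != "archive_status"
```

Under this policy a path such as `pg_wal/some_tool_dir/000000010000000000000002` would be skipped rather than removed, while ordinary segments directly under pg_wal would still be handled as before.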

- Heikki