From: | Maksim Milyutin <milyutinma(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Disallow cancellation of waiting for synchronous replication |
Date: | 2019-12-25 09:34:22 |
Message-ID: | f3ffc220-e601-cc43-3784-f9bba66dc382@gmail.com |
Lists: | pgsql-hackers |
On 21.12.2019 00:19, Tom Lane wrote:
>> There is still a problem when backend is not canceled, but terminated [2].
> Exactly. If you don't have a fix that handles that case, you don't have
> anything. In fact, you've arguably made things worse, by increasing the
> temptation to terminate or "kill -9" the nonresponsive session.
I assume that the case in Andrey's patch proposal where terminating a
backend brings down the whole PostgreSQL instance has to be handled by
external HA agents. Such an agent, running as the parent process of the
postmaster, could intercept the termination and make an appropriate
decision, e.g., restart the PostgreSQL node in a state closed to
external users (via pg_hba.conf manipulation) until all sync replicas
have synchronized the changes from the master. The Stolon HA tool
implements this strategy [1]. This logic (waiting until all replicas
declared in synchronous_standby_names have replicated all WAL from the
master) could be implemented inside the PostgreSQL kernel, after the
recovery process starts and before the database is opened to users, and
this can be done separately later.
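The wait condition described above can be sketched roughly as follows.
This is only an illustration, not code from Stolon or PostgreSQL: the
function names (parse_lsn, standbys_caught_up) and the dict of standby
positions are hypothetical, and the sketch assumes that every listed
standby must catch up (no quorum or priority handling). Only the textual
LSN format (two hex numbers separated by a slash) follows PostgreSQL.

```python
def parse_lsn(text):
    """Convert a PostgreSQL LSN string such as '0/3000060' to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def standbys_caught_up(master_lsn, standby_flush_lsns, sync_names):
    """Return True when every standby in sync_names has flushed WAL up to master_lsn.

    standby_flush_lsns is a hypothetical mapping of standby name -> flush LSN
    string, e.g. as collected by an HA agent. A standby that is missing from
    the mapping is treated as not caught up.
    """
    target = parse_lsn(master_lsn)
    for name in sync_names:
        lsn = standby_flush_lsns.get(name)
        if lsn is None or parse_lsn(lsn) < target:
            return False
    return True
```

An agent following this idea would keep the node closed to external
users and poll until standbys_caught_up(...) returns True.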
Another approach is to implement two-phase commit over the master and
sync replicas (as Oracle did in old versions [2]), where the risk of
ending up with locally committed data after an instance restart or
query cancellation is minimal (once the final commit phase has
started). But this approach has a latency penalty, and it is complex to
resolve partial (prepared but not committed) transactions automatically
when the coordinator (in this case the master node) fails. It would be
nice if this approach were implemented later as an option of
synchronous commit.
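To make the two-phase idea concrete, here is a toy in-memory sketch. It
is not how PostgreSQL or Oracle implement it: the Node class and the
two_phase_commit function are hypothetical, there is no networking or
durability, and coordinator failure between the phases (the hard case
mentioned above) is deliberately not handled.

```python
class Node:
    """Hypothetical participant: the master or one sync replica."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.state = "idle"

    def prepare(self):
        # Phase 1: the node promises it is able to commit.
        if self.can_prepare:
            self.state = "prepared"
        return self.can_prepare

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(master, replicas):
    """Commit on all nodes only if every node prepared successfully."""
    nodes = [master] + list(replicas)
    prepared = []
    for node in nodes:
        if node.prepare():
            prepared.append(node)
        else:
            # One node refused: roll back everything prepared so far.
            for p in prepared:
                p.rollback()
            return False
    # Phase 2: final commit. If the coordinator died here, the nodes
    # would be left in the "prepared" state and need resolution.
    for node in nodes:
        node.commit()
    return True
```

The extra round trip for the prepare phase is where the latency penalty
comes from.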
[2] https://docs.oracle.com/cd/B28359_01/server.111/b28326/repmaster.htm#i33607
--
Best regards,
Maksim Milyutin