Re: synchronized standby: committed local and waiting for remote ack

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: qihua wu <staywithpin(at)gmail(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: synchronized standby: committed local and waiting for remote ack
Date: 2023-01-14 05:31:28
Message-ID: 920090.1673674288@sss.pgh.pa.us
Lists: pgsql-general

qihua wu <staywithpin(at)gmail(dot)com> writes:
> We are using Patroni to set up 1 primary and 5 slaves, with ANY 2 (*)
> so that a transaction commits once any 2 standbys have received the WAL.
> If there is a network partition between the primary and the slaves, the
> commit hangs from the user's perspective, but it has actually been done
> locally and is just waiting for a remote ack that cannot arrive because
> of the network split. If Patroni then promotes a slave to primary, we
> will lose data. Have you considered a different design: first wait for
> the remote ACK, and then commit locally? This would only fail if the
> local commit failed, and a local commit failure is much rarer than a
> network partition in a cloud environment: it only happens on an I/O
> error or when the disk is full. So I am wondering whether the order could
> be switched: first wait for the remote ACK, and then commit locally.

That just gives you a different set of failure modes. It'd be
particularly bad if you have more than one standby, because you could
easily get into a situation where *none* of the nodes represent truth.

regards, tom lane
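
The following is a minimal sketch, not taken from the thread, of one way the
reversed ordering could leave the nodes in disagreement. It is a toy model
rather than PostgreSQL's replication code: Node, remote_first_commit, and the
"wal" list are hypothetical names, and it assumes a standby durably records a
transaction as soon as it receives it. With ANY 2 and only one reachable
standby, the transaction is never committed on the primary, yet one standby
already holds it while the other four do not.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for a server; "wal" lists transactions it has recorded.
    name: str
    wal: list = field(default_factory=list)


def remote_first_commit(primary, standbys, txn, reachable, quorum):
    """Toy model of 'wait for remote ACK first, then commit locally'."""
    acked = 0
    for standby in standbys:
        if standby.name in reachable:
            # Assumption: the standby records the transaction before acknowledging.
            standby.wal.append(txn)
            acked += 1
    if acked < quorum:
        # The primary never commits, but reachable standbys already hold txn.
        return "no quorum"
    primary.wal.append(txn)
    return "committed"


primary = Node("primary")
standbys = [Node(f"s{i}") for i in range(1, 6)]

# A network partition leaves only s1 reachable; ANY 2 needs two acknowledgements.
outcome = remote_first_commit(primary, standbys, "txn-1", reachable={"s1"}, quorum=2)
holders = [n.name for n in [primary, *standbys] if "txn-1" in n.wal]
print(outcome, holders)
```

In this model, promoting s1 would expose a transaction the primary never
committed, while promoting any other standby would drop it, so which node
holds the "true" state depends on which one is promoted.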
