Re: synchronized standby: committed local and waiting for remote ack

From: qihua wu <staywithpin(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: synchronized standby: committed local and waiting for remote ack
Date: 2023-01-17 03:06:44
Message-ID: CAPoYtoJv8AdOb_64PndVPQ34yu6wd29V+CP8q2GZBjwmg0Cxzg@mail.gmail.com
Lists: pgsql-general

How should I understand "because you could
easily get into a situation where *none* of the nodes represent truth"?

In the current design, when a user commits, the transaction is first
committed on the primary, which then waits for the slave ack. If the slaves
and the primary are split by a network partition, the user's COMMIT hangs
indefinitely. The client side usually has a timeout, and when it fires the
user will THINK the commit failed, although it actually committed on the
primary, so the primary doesn't reflect the truth. If the commit instead
first waited for the slave ack and then committed locally, then under a
network partition it would commit on neither the primary nor the slaves,
which is good. Without a partition, it would most likely commit on both the
slaves and the primary, since the local commit is less likely to error out.
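
To make the two orderings concrete, here is a minimal sketch as a toy Python
model. It is not PostgreSQL or patroni code: Node, commit_local_first and
commit_remote_first are hypothetical names, a partition is modelled as a
simple "reachable" flag, and "committed" on a standby only means the record
arrived there.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    reachable: bool = True  # False models a network partition
    committed: set = field(default_factory=set)


def ship_to_standbys(standbys, txid):
    """Send the record to every reachable standby and count the acks."""
    acks = 0
    for s in standbys:
        if s.reachable:
            s.committed.add(txid)
            acks += 1
    return acks


def commit_local_first(primary, standbys, txid, quorum):
    """Current order: commit on the primary, then wait for standby acks."""
    primary.committed.add(txid)  # the local commit has already happened
    acks = ship_to_standbys(standbys, txid)
    # Without a quorum the real COMMIT would hang; the client gives up.
    return "ok" if acks >= quorum else "client timeout"


def commit_remote_first(primary, standbys, txid, quorum):
    """Proposed order: wait for standby acks, then commit on the primary."""
    acks = ship_to_standbys(standbys, txid)
    if acks < quorum:
        return "client timeout"  # the primary never commits
    primary.committed.add(txid)
    return "ok"


if __name__ == "__main__":
    for commit in (commit_local_first, commit_remote_first):
        primary = Node("primary")
        # Full partition: none of the 5 standbys is reachable.
        standbys = [Node(f"standby{i}", reachable=False) for i in range(5)]
        result = commit(primary, standbys, txid=1, quorum=2)
        print(commit.__name__, result, 1 in primary.committed)
```

With all five standbys unreachable and a quorum of 2, both orderings end in a
client timeout, but only the current order leaves the transaction committed
on the primary. The model says nothing about what happens when only some of
the standbys are reachable.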

On Sat, Jan 14, 2023 at 1:31 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> qihua wu <staywithpin(at)gmail(dot)com> writes:
> > We are using patroni to set up 1 primary and 5 slaves, and using ANY 2 (*)
> > to commit a transaction if any 2 standbys receive the WAL. If there is a
> > network partitioning between the primary and the slave, then commit will
> > hang from user perspective, but the commit is actually done locally, just
> > waiting for remote ack which is not possible because of network split. And
> > if patroni promotes a slave to primary, then we will lost data. Do you
> > think of a different design: first wait for remote ACK, and then commit
> > locally, this will only failed if local commit failed, but local commit
> > fail is much rarer than network partitioning in a cloud env: it will only
> > fail when IO issue or disk is full. So I am thinking of the possibility of
> > switch the order: first wait for remote ACK, and then commit locally.
>
> That just gives you a different set of failure modes. It'd be
> particularly bad if you have more than one standby, because you could
> easily get into a situation where *none* of the nodes represent truth.
>
> regards, tom lane
>
