Re: How to Qualifying or quantify risk of loss in asynchronous replication

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: otheus uibk <otheus(dot)uibk(at)gmail(dot)com>
Cc: Forums postgresql <pgsql-general(at)postgresql(dot)org>
Subject: Re: How to Qualifying or quantify risk of loss in asynchronous replication
Date: 2016-03-16 09:33:22
Message-ID: CAEepm=0WicoiJ74MAYe1pcyVeUzvQpYNptYj907=peXbvg2EGw@mail.gmail.com
Lists: pgsql-general

On Wed, Mar 16, 2016 at 9:59 PM, otheus uibk <otheus(dot)uibk(at)gmail(dot)com> wrote:
>> In asynchronous replication,
>> the primary writes to the WAL and flushes the disk. Then, for any
>> standbys that happen to be connected, a WAL sender process trundles
>> along behind feeding new WAL down the socket as soon as it can, but
>> it can be running arbitrarily far behind or not running at all (the
>> network could be down or saturated, the standby could be temporarily
>> down or up but not reading the stream fast enough, etc etc).
>
> This is the *process* I want more detail about. The question is the same as
> above:
>> (is it true that) PG async guarantees that the WAL
>> is *sent* to the receivers, but not that they are received, before the
>> client receives acknowledgement?

The primary writes WAL to disk, and then wakes up walsender processes,
and they read the WAL from disk (presumably straight out of the OS
page cache) in the background and send it down the network some time
later. Async replication doesn't guarantee anything about the WAL
being sent.
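
To make the ordering concrete, here is a toy Python model of the sequence described above. It is only a sketch under stated assumptions: the class `Primary`, its methods, and the idea of counting WAL in whole records are all hypothetical and are not PostgreSQL code. The point it illustrates is that the commit acknowledgement depends only on the local flush, so whatever has been flushed but not yet sent is what you stand to lose if the primary disappears.

```python
class Primary:
    """Toy model (hypothetical): the WAL is a list and positions are list indexes."""

    def __init__(self):
        self.wal = []        # records written and flushed on the primary
        self.sent_upto = 0   # how far the background sender has shipped

    def commit(self, record):
        # Write and flush locally, then acknowledge the client.
        # Nothing here waits for, or even checks on, the sender.
        self.wal.append(record)
        return len(self.wal)

    def sender_step(self, max_records):
        # The background sender ships what it can; it may run late or not at all.
        end = min(len(self.wal), self.sent_upto + max_records)
        shipped = self.wal[self.sent_upto:end]
        self.sent_upto = end
        return shipped

    def unsent(self):
        # Acknowledged records that no standby has been sent yet.
        return len(self.wal) - self.sent_upto


primary = Primary()
for i in range(5):
    primary.commit(f"txn-{i}")       # all five are acknowledged to clients

at_risk_before = primary.unsent()    # sender has not run yet
shipped = primary.sender_step(2)     # sender catches up by two records
at_risk_after = primary.unsent()
```

In this model the at-risk window is simply `unsent()`, and nothing bounds it: if `sender_step` is never called, every acknowledged commit remains unsent.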

Look for WalSndWakeupRequest() in xlog.c, which expands to a call to
WalSndWakeup in walsender.c which sets latches (= a mechanism for
waking processes) on all walsenders, and see the WaitLatchOrSocket
calls in walsender.c which wait for that to happen.
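
As a rough analogy for the latch handshake, the following Python sketch uses `threading.Event` in place of a latch. This is an assumption-laden illustration, not how PostgreSQL implements it (real latches live in C and work across processes); `run_demo`, `walsender`, and the `state` dictionary are hypothetical names. The committing side appends, acknowledges, and sets the latch; the sender sleeps on the latch, resets it, and then looks for new work.

```python
import threading


def run_demo(records):
    """Toy latch handshake (hypothetical names, not PostgreSQL code)."""
    latch = threading.Event()      # stands in for a walsender's latch
    lock = threading.Lock()
    state = {"wal": [], "shutdown": False}
    sent = []

    def walsender():
        pos = 0
        while True:
            latch.wait()           # sleep until someone sets the latch
            latch.clear()          # reset first, then look for work
            with lock:
                new = state["wal"][pos:]
                shutdown = state["shutdown"]
            sent.extend(new)       # "send" whatever has accumulated
            pos += len(new)
            if shutdown:
                return

    sender = threading.Thread(target=walsender)
    sender.start()

    acked = []
    for record in records:
        with lock:
            state["wal"].append(record)   # write and flush locally
        acked.append(record)              # client is acknowledged here
        latch.set()                       # wake the sender; do not wait for it

    with lock:
        state["shutdown"] = True
    latch.set()
    sender.join()                         # clean shutdown lets the sender drain
    return acked, sent


acked, sent = run_demo(["a", "b", "c"])
```

Only the clean shutdown at the end guarantees that `sent` catches up with `acked`. At any moment before that, the sender may have processed none, some, or all of the acknowledged records, which is the same lack of guarantee described above.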

--
Thomas Munro
http://www.enterprisedb.com
