From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
Cc: | Xin Zhang <xzhang(at)pivotal(dot)io>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Ashwin Agrawal <aagrawal(at)pivotal(dot)io>, Asim Praveen <apraveen(at)pivotal(dot)io> |
Subject: | Re: Commits don't block for synchronous replication |
Date: | 2017-09-20 02:07:31 |
Message-ID: | CAD21AoDLU3pK7boR10sTq0uWfzhu49TJtm62uJkx271YTkNQxg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Sep 19, 2017 at 2:26 PM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> On Tue, Sep 19, 2017 at 7:50 AM, Xin Zhang <xzhang(at)pivotal(dot)io> wrote:
>> If primary crashed at that moment, and failover to standby, the foo table is
>> lost, even though the replication is synced according to
>> `pg_stat_replication` view.
>
> GUC parameters are reloaded each time a query is run, and so
> SyncRepConfig is filled with the parsed data of SyncRepStandbyNames
> once the parameter is reloaded for the process. Still, here, a commit
> is waiting for a signal from a WAL sender that the wanted LSN has been
> correctly flushed on a standby so this code path does not care about
> the state of SyncRepConfig saved in the context of the process, we
> want to know what the checkpointer thinks about it. Hence using WAL
> sender data or sync_standbys_defined as a source of truth looks like a
> correct concept to me, making the problem of this bug legit.
>
I agree with the analysis and the approach of this patch.
> The check with SyncRepRequested() still holds truth: max_wal_senders
> needs a restart to be updated. Also, the other caller of
> SyncStandbysDefined() requires SyncRepConfig to be set, so this caller
> is fine.
Yeah, after reloaded the config there might be some wal senders that
don't reflect it yet but I think it cannot be a problem.
> I have looked at your patch and tested it, but found no problems
> associated with it. A backpatch would be required, so I have added an
> entry in the next commit fest with status set to "ready for committer"
> so as this bug does not fall into the cracks.
Also I found no problems in the patch.
>
>> A separate question, is the `pg_stat_replication` view the reliable way to
>> find when to failover to a standby, or there are some other ways to ensure
>> the standby is in-sync with the primary?
>
> It shows at SQL level what is currently present in shared memory by
> scanning all the WAL sender entries, so this report uses the same data
> as the backend themselves, so that's a reliable source. In Postgres
> 10, pg_stat_activity is also able to show to users what are the
> backends waiting for a change to be flushed/applied on the standby
> using the wait event called SyncRep. You could make some use of that
> as well.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2017-09-20 02:14:32 | Re: sync process names between ps and pg_stat_activity |
Previous Message | Robert Haas | 2017-09-20 02:01:54 | Re: Boom filters for hash joins (was: A design for amcheck heapam verification) |