From: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Vik Fearing <vik(at)2ndquadrant(dot)fr>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Quorum commit for multiple synchronous replication. |
Date: | 2016-09-28 08:14:41 |
Message-ID: | CAB7nPqQmQfPUzLO7SiYrBrav5SOdy+UrjW-QsHhRKMLtd+wyTQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Sep 24, 2016 at 5:37 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> I still vote for changing behaviour of existing syntax 'k (n1, n2)' to
> quorum commit.
> That is,
> 1. 'First k (n1, n2, n3)' means that the master server waits for ACKs
> from k standby servers whose name appear earlier in the list.
> 2. 'Any k (n1, n2, n3)' means that the master server waits for ACKs
> from any k listed standby servers.
> 3. 'n1, n2, n3' is the same as #1 with k=1.
> 4. '(n1, n2, n3)' is the same as #2 with k=1.
OK, so I have done a review of this patch keeping that in mind as
that's the consensus. I am still getting familiar with the code...
- transactions will wait until all the standby servers which are considered
+ transactions will wait until the multiple standby servers which
are considered
There is no real need to update this sentence.
+ <literal>FIRST</> means to control the standby servers with
+ different priorities. The synchronous standbys will be those
+ whose name appear earlier in this list, and that are both
+ currently connected and streaming data in real-time(as shown
+ by a state of <literal>streaming</> in the
+ <link linkend="monitoring-stats-views-table">
+ <literal>pg_stat_replication</></link> view). Other standby
+ servers appearing later in this list represent potential
+ synchronous standbys. If any of the current synchronous
+ standbys disconnects for whatever reason, it will be replaced
+ immediately with the next-highest-priority standby.
+ For example, a setting of <literal>FIRST 3 (s1, s2, s3, s4)</>
+ makes transaction commits wait until their WAL records are received
+ by three higher-priority standbys chosen from standby servers
+ <literal>s1</>, <literal>s2</>, <literal>s3</> and <literal>s4</>.
It does not seem necessary to me to enter in this level of details:
The keyword FIRST, coupled with an integer number N, chooses the first
N higher-priority standbys and makes transaction commit when their WAL
records are received. For example <literal>FIRST 3 (s1, s2, s3, s4)</>
makes transaction commits wait until their WAL records are received by
the three high-priority standbys chosen from standby servers s1, s2,
s3 and s4.
+ <literal>ANY</> means to control all of standby servers with
+ same priority. The master sever will wait for receipt from
+ at least <replaceable class="parameter">num_sync</replaceable>
+ standbys, which is quorum commit in the literature. The all of
+ listed standbys are considered as candidate of quorum commit.
+ For example, a setting of <literal> ANY 3 (s1, s2, s3, s4)</> makes
+ transaction commits wait until receiving receipts from at least
+ any three standbys of four listed servers <literal>s1</>,
+ <literal>s2</>, <literal>s3</>, <literal>s4</>.
Similarly, something like that...
The keyword ANY, coupled with an integer number N, chooses N standbys
in a set of standbys with the same, lowest, priority and makes
transaction commit when WAL records are received those N standbys. For
example ANY 3(s1, s2, s3, s4) makes transaction commits wait until WAL
records have been received from 3 servers in the set s1, s2, s3 and
s4.
It could be good also to mention that no keyword specified means ANY,
which is incompatible with 9.6. The docs also miss the fact that if a
simple list of servers is given, without parenthesis and keywords,
this is equivalent to FIRST 1.
-synchronous_standby_names = '2 (s1, s2, s3)'
+synchronous_standby_names = 'First 2 (s1, s2, s3)'
Nit here. It may be a good idea to just use upper-case characters in
the docs, or just lower-case for consistency, but not mix both.
Usually GUCs use lower-case characters.
+ when standby is considered as a condidate of quorum commit.</entry>
s/condidate/candidate/
-syncrep_scanner.c: FLEXFLAGS = -CF -p
+syncrep_scanner.c: FLEXFLAGS = -CF -p -i
Hm... Is that actually a good idea? Now "NODE" and "node" are two
different things for application_name, but with this patch both would
have the same meaning. I am getting to think that we could just use
the lower-case characters for the keywords any/first. Is this -i
switch a problem for elements in standby_list?
+ * Calculate the 'pos' newest Write, Flush and Apply positions among
sync standbys.
I don't understand this comment.
+ if (SyncRepConfig->sync_method == SYNC_REP_PRIORITY)
+ got_recptr = SyncRepGetOldestSyncRecPtr(&writePtr, &flushPtr,
+ &applyPtr, &am_sync);
+ else /* SYNC_REP_QUORUM */
+ got_recptr = SyncRepGetNNewestSyncRecPtr(&writePtr, &flushPtr,
+ &applyPtr,
SyncRepConfig->num_sync,
+ &am_sync);
Those could be grouped together, there is no need to have pos as an argument.
+ /* In quroum method, all sync standby priorities are always 1 */
+ if (found && SyncRepConfig->sync_method == SYNC_REP_QUORUM)
+ priority = 1;
This is dead code, SyncRepGetSyncStandbysPriority is not called for
QUORUM. You may want to add an assert in
SyncRepGetSyncStandbysPriority and SyncRepGetSyncStandbysQuorum to be
sure that they are getting called for the correct method.
+ /* Sort each array in descending order to get 'pos' newest element */
+ qsort(write_array, len, sizeof(XLogRecPtr), cmp_lsn);
+ qsort(flush_array, len, sizeof(XLogRecPtr), cmp_lsn);
+ qsort(apply_array, len, sizeof(XLogRecPtr), cmp_lsn);
There is no need to reorder things again and to use arrays, you can
choose the newest LSNs when scanning the WalSnd entries.
--
Michael
From | Date | Subject | |
---|---|---|---|
Next Message | Etsuro Fujita | 2016-09-28 09:20:05 | Re: Push down more full joins in postgres_fdw |
Previous Message | amul sul | 2016-09-28 08:10:44 | Re: Bug in to_timestamp(). |