Re: Support for N synchronous standby servers - take 2

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Beena Emerson <memissemerson(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for N synchronous standby servers - take 2
Date: 2016-02-05 09:59:42
Message-ID: CAB7nPqQY1gvSPtd6pU_AMnMVLL3DZUDVPdwMAUME6qUEAKg1Aw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 5, 2016 at 12:19 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> On Fri, Feb 5, 2016 at 5:36 PM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
> I agree with adding a new system catalog so that users can easily check
> replication status. A group name will be needed for this.
> What about adding the group name with ":" immediately after the set of
> standbys, as follows?

This way is fine for me.

> 2[local, 2[london1, london2, london3]:london, (tokyo1, tokyo2):tokyo]
>
> Also, to show the sync replication status according to the
> configuration, I'm thinking of a view with the following definition.
>
> =# \d pg_synchronous_replication
>      Column     |  Type   | Modifiers
> ----------------+---------+-----------
>  name           | text    |
>  sync_type      | text    |
>  wait_num       | integer |
>  sync_priority  | integer |
>  sync_state     | text    |
>  member         | text[]  |
>  level          | integer |
>  write_location | pg_lsn  |
>  flush_location | pg_lsn  |
>  apply_location | pg_lsn  |
>
> - "name" : node name or group name, or "main" meaning top level node.

Check.
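
To make the naming concrete, here is a rough Python sketch of how a
string like the one quoted above could be parsed into named entries.
Nothing here is taken from the patch: the Entry class and the parse()
function are made up for illustration, and I am assuming that "[...]"
denotes a priority group, "(...)" a quorum group, that an omitted count
means 1, and that the unnamed top-level group is called "main".

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Tokens: a count, a node/group name, or one of the punctuation characters.
TOKEN = re.compile(r"\s*(\d+|[A-Za-z_][A-Za-z0-9_]*|[\[\]\(\),:])")


@dataclass
class Entry:
    """Hypothetical in-memory form of one node or group."""
    name: Optional[str]
    sync_type: Optional[str] = None   # 'priority' or 'quorum' for groups, None for nodes
    wait_num: Optional[int] = None    # None for plain nodes
    members: List["Entry"] = field(default_factory=list)


def tokenize(text: str) -> List[str]:
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens: List[str]):
        self.tokens, self.i = tokens, 0

    def peek(self) -> Optional[str]:
        return self.tokens[self.i] if self.i < len(self.tokens) else None

    def take(self, expected: Optional[str] = None) -> str:
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError(f"expected {expected!r}, got {tok!r}")
        self.i += 1
        return tok

    def name(self) -> str:
        tok = self.take()
        if not (tok[0].isalpha() or tok[0] == "_"):
            raise ValueError(f"expected a name, got {tok!r}")
        return tok

    def element(self) -> Entry:
        wait_num = None
        if self.peek() is not None and self.peek().isdigit():
            wait_num = int(self.take())
        opener = self.peek()
        if opener in ("[", "("):
            # Assumption: [] is a priority group, () is a quorum group.
            closer, sync_type = ("]", "priority") if opener == "[" else (")", "quorum")
            self.take()
            members = [self.element()]
            while self.peek() == ",":
                self.take()
                members.append(self.element())
            self.take(closer)
            group_name = None
            if self.peek() == ":":          # optional ":name" right after the set
                self.take()
                group_name = self.name()
            # Assumption: a group without an explicit count waits for 1 member.
            return Entry(group_name, sync_type, 1 if wait_num is None else wait_num, members)
        if wait_num is not None:
            raise ValueError("a count must be followed by a group")
        return Entry(self.name())


def parse(text: str) -> Entry:
    parser = Parser(tokenize(text))
    root = parser.element()
    if parser.peek() is not None:
        raise ValueError(f"trailing input: {parser.peek()!r}")
    if root.name is None:
        root.name = "main"                  # top-level node, as in the "name" column
    return root


root = parse("2[local, 2[london1, london2, london3]:london, (tokyo1, tokyo2):tokyo]")
```

With the example string, root is the "main" priority group waiting for
2 of its members local, london and tokyo, where london is a priority
group and tokyo is a quorum group.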

> - "sync_type" : 'priority' or 'quorum' for group node, otherwise NULL.

That would be one or the other.

> - "wait_num" : number of nodes/groups to wait for in this group.

Check. This is taken directly from the meta data.

> - "sync_priority" : priority of node/group in this group. "main" node has "0".
>   - a standby in a quorum group always has priority 1.
>   - a standby in a priority group has a priority according to
>     definition order.

This is a bit confusing if the same node or group is in multiple
groups. My previous suggestion was to list the elements of the group
in increasing order of priority. That's an important point.
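
As a small illustration of why a single per-row priority is ambiguous,
the following Python sketch derives the priority from the position of
each member in its group's list instead. The member_priorities()
function, the dictionary layout and the "backup" group are hypothetical
and only serve the example; the rule that quorum members all get
priority 1 is the one from the quoted proposal.

```python
from typing import Dict, List, Tuple


def member_priorities(
    groups: Dict[str, Tuple[str, List[str]]]
) -> List[Tuple[str, str, int]]:
    """Return (group, member, priority) rows.

    Members are assumed to be listed in increasing order of priority, so
    the priority is a property of the (group, member) pair, not of the
    member alone.
    """
    rows = []
    for group_name, (sync_type, members) in groups.items():
        for position, member in enumerate(members, start=1):
            # Quorum groups: every member has priority 1.
            priority = position if sync_type == "priority" else 1
            rows.append((group_name, member, priority))
    return rows


groups = {
    "main": ("priority", ["local", "london", "tokyo"]),
    "london": ("priority", ["london1", "london2", "london3"]),
    "tokyo": ("quorum", ["tokyo1", "tokyo2"]),
    # Hypothetical extra group that reuses nodes from the groups above.
    "backup": ("priority", ["london3", "tokyo1"]),
}

rows = member_priorities(groups)
```

Here london3 has priority 3 in "london" but priority 1 in "backup", so
one sync_priority value per node cannot describe both memberships,
while the ordered member list of each group can.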

> - "sync_state" : 'sync' or 'potential' or 'quorum'.
>   - a standby in a quorum group is always 'quorum'.
>   - a standby in a priority group is 'sync' / 'potential'.

potential and quorum are the same thing, no? The only difference is
based on the group type here.

> - "member" : array of members for group node, otherwise NULL.

This can be NULL only when the entry is a node.

> - "level" : nested level. "main" node is level 0.

Not sure this one is necessary.

> - "write/flush/apply_location" : group/node calculated LSN according
> to configuration.

This does not need to be part of this catalog, that's a representation
of the data that is part of the WAL sender.
--
Michael
