Re: Issues with two-server Synch Rep

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Issues with two-server Synch Rep
Date: 2010-10-13 07:19:27
Message-ID: AANLkTim-RSBZTYuUxo5=xbv2hNprtL2cUiiXf_36FVVk@mail.gmail.com
Lists: pgsql-hackers

On Fri, Oct 8, 2010 at 3:05 AM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> Adding a Synch Standby
> -----------------------
> What is the procedure for adding a new synchronous standby in your
> implementation?  That is, how do we go from having a standby server with
> an empty PGDATA to having a working synchronous standby?

In my patch, you still have to take a base backup from the master
and start the standby from that.
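For context, the 9.0-style procedure (which I assume this patch leaves unchanged) is: run pg_start_backup() on the master, copy $PGDATA to the standby, run pg_stop_backup(), and then create a recovery.conf on the standby along these lines:

```
# recovery.conf on the standby (9.0-style settings; host and user below
# are placeholders, not values from the patch)
standby_mode = 'on'
primary_conninfo = 'host=master.example.com port=5432 user=replication'
```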

> Snapshot Publication
> ---------------------
> During 9.0 development discussion, one of the things we realized we
> needed for synch standby was publication of snapshots back to the master
> in order to prevent query cancel on the standby.  Without this, the
> synch standby is useless for running read queries.  Does your patch
> implement this?

No. I think that this has almost nothing to do with sync rep itself.

To solve this problem, I think that we should implement a mechanism like
an UNDO segment on the standby instead of snapshot publication. That is,
replay of a VACUUM operation would copy the old version of a tuple
somewhere instead of removing it immediately, until all the transactions
that can still see it have gone away. Implementing this would be
difficult, though.
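As a toy illustration of the idea (my own sketch, not anything from the patch): the standby would keep each removed tuple version in an undo area, and only discard it once no running transaction could still see it.

```python
# Toy model of an "UNDO segment": retain old tuple versions removed by
# VACUUM replay until the oldest active transaction can no longer see them.

class UndoKeeper:
    def __init__(self):
        self.undo = []  # list of (removed_at_xid, old_tuple)

    def vacuum_remove(self, xid, old_tuple):
        # Instead of discarding the old version immediately, keep it.
        self.undo.append((xid, old_tuple))

    def prune(self, oldest_active_xid):
        # Drop versions that no running transaction can still see,
        # i.e. those removed before the oldest active transaction began.
        self.undo = [(x, t) for x, t in self.undo if x >= oldest_active_xid]

k = UndoKeeper()
k.vacuum_remove(100, 'row v1')
k.vacuum_remove(205, 'row v2')
k.prune(oldest_active_xid=200)
print([t for _, t in k.undo])  # ['row v2']
```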

> Management
> -----------
> One of the serious flaws currently in HS/SR is complexity of
> administration.  Setting up and configuring even a single master and
> single standby requires editing up to 6 configuration files in Postgres,
> as well as dealing with file permissions.  As such, any Synch Rep patch
> must work together with attempts to simplify administration.  How does
> your design do this?

No. What is worse, my patch introduces a new configuration file,
standbys.conf.

Patch aside, I agree that the synchronous standbys should be specified
in an existing configuration file, such as postgresql.conf on the master
or recovery.conf on the standby, rather than in a new one.
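For example, something along these lines in postgresql.conf on the master. The parameter name here is purely hypothetical, invented for illustration; no such GUC exists in the patch under discussion:

```
# Hypothetical sketch only -- this GUC does not exist; the name is
# invented to illustrate listing synchronous standbys in postgresql.conf.
synchronous_standbys = 'standby1, standby2'
```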

> Monitoring
> -----------
> Synch rep offers severe penalties to availability if a synch standby
> gets behind or goes down.  What replication-specific monitoring tools
> and hooks are available to allow administators to take action before the
> database becomes unavailable?

Yeah, if you choose recv or fsync as the synchronization level, recovery
on the standby might fall behind the master even in sync rep. This delay
would increase the failover time.

To monitor that, you can use pg_current_xlog_location() on the master and
pg_last_xlog_replay_location() on the standby. My patch doesn't provide
any other monitoring mechanism.
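After fetching those two locations with psql, the lag in bytes can be computed by hand. This small helper is my own illustration, not part of the patch; it assumes the text location format 'hi/lo' (two hex fields) read as a 64-bit position:

```python
def xlog_to_bytes(loc):
    """Convert a text xlog location like '0/3000148' to a 64-bit offset."""
    hi, lo = loc.split('/')
    return (int(hi, 16) << 32) | int(lo, 16)

def lag_bytes(master_loc, standby_replay_loc):
    """Bytes of WAL the standby still has to replay."""
    return xlog_to_bytes(master_loc) - xlog_to_bytes(standby_replay_loc)

print(lag_bytes('0/3000148', '0/3000100'))  # 72
```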

> Degradation
> ------------
> In the event that the synch rep standby falls too far behind or becomes
> unavailable, or is deliberately taken offline, what are you envisioning
> as the process for the DBA resolving the situation?  Is there any
> ability to commit "stuck" transactions?

Since my patch doesn't provide a wait-forever option, there is obviously
no capability to resume transactions that are stuck waiting for the
standby to catch up.

> Client Consistency
> ---------------------
> With a standby in "apply" mode, and a master failure at the wrong time,
> there is the possibility that the Standby will apply a transaction at
> the same time that the master crashes, causing the client to never
> receive a commit message.  Once the client reconnects to the standby,
> how will it know whether its transaction was committed or not?

This problem can happen even when you don't use replication, so it
should be addressed separately from sync rep.

> As a lesser case, a standby in "apply" mode will show the results of
> committed transactions *before* they are visible on the master.  Is
> there any need to handle this?  If so, how?

A cluster-wide snapshot would be required to ensure that all the standbys
return the same result. The "Export snapshots to other sessions" feature
proposed at the Cluster Developer Meeting is one step toward that, I think.
http://wiki.postgresql.org/wiki/ClusterFeatures

> Performance
> ------------
> As with XA, synch rep has the potential to be so slow as to be unusable.
>  What optimizations to you make in your approach to synch rep to make it
> faster than two-phase commit?  What other performance optimizations have
> you added?

Allowing walsender to send WAL records that have not yet been fsync'd
(i.e., writing and sending WAL in parallel) would improve performance
significantly. Obviously my patch does not provide this improvement.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
