Re: monitoring bdr nodes

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Dennis <dennisr(at)visi(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: monitoring bdr nodes
Date: 2015-04-20 12:25:59
Message-ID: CAMsr+YHV7=vgJzLFXb8M8usnSssa9MDvJD3g+Mem4QVUWhwVWw@mail.gmail.com
Lists: pgsql-general

On 16 April 2015 at 23:58, Dennis <dennisr(at)visi(dot)com> wrote:
> I need some clarification on how to monitor BDR nodes. In particular determining replication lag. As an example, I have a two node cluster with nodes ‘A’ and ‘B’. I need to be able to look at node ‘B’ and determine if it is lagging behind node ‘A’, by interrogating node ‘B’ only.

You can't. That doesn't really make sense, either in BDR or in regular
PostgreSQL streaming replication.

For that to be possible, node 'B' would need some side-channel by
which it found out the current WAL insert position of node 'A'. Which
effectively means communicating in real time with node 'A'... so the
client might as well do it instead. We can't do this effectively on
the walsender stream without some kind of interrupt message that can
be priority-injected into the stream, and even then it wouldn't help
if the issue was packet loss causing connection issues, etc.

If you're in a position where node 'B' can make direct libpq
non-replication connections to 'A' but the client can't, you could use
postgres_fdw to expose a view of node A's
pg_current_xlog_insert_location(), plus the pg_replication_slots and
pg_stat_replication views. That seems a bit of an odd situation to me,
though.
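
Whichever side ends up doing the comparison, the lag is just the byte
distance between two WAL positions: node A's current insert location and
the position node B has reached. As a rough client-side sketch (the
function names lsn_to_int and lag_bytes are made up for illustration, and
the LSN values in the comments are invented, not real output), assuming
both positions arrive as text in the usual 'hi/lo' hexadecimal form:

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to an absolute byte position.

    The part before the slash is the high 32 bits and the part after it
    is the low 32 bits, both written in hexadecimal.
    """
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def lag_bytes(upstream_insert_lsn: str, downstream_lsn: str) -> int:
    """Return how many bytes of WAL the downstream node is behind.

    upstream_insert_lsn: node A's current insert location (hypothetical input)
    downstream_lsn:      the position node B has reached (hypothetical input)
    """
    return lsn_to_int(upstream_insert_lsn) - lsn_to_int(downstream_lsn)


# Illustrative values only: A is at 0/3000060 and B has reached 0/3000000.
behind = lag_bytes("0/3000060", "0/3000000")
```

On the server side, pg_xlog_location_diff() does the same subtraction, so
this is only useful if the two positions are collected separately by the
monitoring client.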

> Because it is querying the pg_stat_replication table, I will need to run this query on node ‘A’ to check the lag on node ‘B’, is that true?

Correct. I'll make the docs more explicit about that.

> I need to be able to run a query on node ‘B’ to determine if node ‘B’ is behind. I am not sure the above query will work for that use case.

It won't, and you really can't.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
