From: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
Cc: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Measuring replay lag |
Date: | 2017-03-05 07:31:42 |
Message-ID: | CANP8+jJ6pkZjXEccZe+wGECUoEmLCt2bUxnM7mXGO=bAmeKknw@mail.gmail.com |
Lists: | pgsql-hackers |
On 1 March 2017 at 10:47, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> On Fri, Feb 24, 2017 at 9:05 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> On 21 February 2017 at 21:38, Thomas Munro
>> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>>> However, I think a call like LagTrackerWrite(SendRqstPtr,
>>> GetCurrentTimestamp()) needs to go into XLogSendLogical, to mirror
>>> what happens in XLogSendPhysical. I'm not sure about that.
>>
>> Me neither, but I think we need this for both physical and logical.
>>
>> Same use cases graphs for both, I think. There might be issues with
>> the way LSNs work for logical.
>
> This seems to be problematic. Logical peers report LSN changes for
> all three operations (write, flush, commit) only on commit. I suppose
> that might work OK for synchronous replication, but it makes it a bit
> difficult to get lag measurements that don't look really strange and
> sawtoothy when you have long transactions, and overlapping
> transactions might interfere with the measurements in odd ways. I
> wonder if the way LSNs are reported by logical rep would need to be
> changed first. I need to study this some more and would be grateful
> for ideas from any of the logical rep people.
I have no doubt that the nature of logical replication causes
problems here. Those problems are not this patch's to solve, but that
does not put logical replication out of scope.
What we want from this patch is something that works for both physical
and logical replication, as far as that is possible.
With that in mind, this patch should be able to provide sensible lag
measurements from a simple case like logical replication of a standard
pgbench run. If that highlights problems with this patch then we can
fix them here.
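As a rough illustration of the reporting issue quoted above, here is a toy Python model. It is not taken from the patch. The `LagTracker` class, its `read` method and the `simulate` helper are all hypothetical; only the idea of recording an (LSN, timestamp) pair at send time comes from the `LagTrackerWrite(SendRqstPtr, GetCurrentTimestamp())` call mentioned in the thread. The sketch assumes one LSN is sent per clock tick and that the peer is always one tick behind.

```python
from bisect import bisect_right


class LagTracker:
    """Toy model: remember (lsn, send_time) samples and turn peer reports into lag."""

    def __init__(self):
        self.lsns = []   # ascending LSNs of recorded samples
        self.times = []  # send time of each sample

    def write(self, lsn, now):
        # Record when WAL up to `lsn` was sent (cf. LagTrackerWrite in the thread).
        self.lsns.append(lsn)
        self.times.append(now)

    def read(self, reported_lsn, now):
        # Hypothetical counterpart: lag for the newest sample the peer has reached.
        i = bisect_right(self.lsns, reported_lsn)
        if i == 0:
            return None  # the report covers no outstanding sample
        lag = now - self.times[i - 1]
        del self.lsns[:i], self.times[:i]  # consumed samples are no longer needed
        return lag


def simulate(report_every, ticks=10, delay=1):
    """Send one LSN per tick; the peer reports every `report_every` ticks."""
    tracker = LagTracker()
    measurements = []
    for now in range(1, ticks + 1):
        tracker.write(lsn=now, now=now)
        if now % report_every == 0:
            # Assumption: the peer has processed everything sent up to `delay` ticks ago.
            lag = tracker.read(reported_lsn=now - delay, now=now)
            if lag is not None:
                measurements.append((now, lag))
    return measurements


per_tick = simulate(report_every=1)     # peer reports continuously
commit_only = simulate(report_every=5)  # peer reports only at each "commit"
```

In this model both peers are equally far behind, but the commit-only peer yields a measurement only at ticks 5 and 10, so anything graphing lag has to hold or guess the value in between. The sketch does not attempt to model long or overlapping transactions.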
Thanks
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services