Re: Replication lag from transaction logs

From: Debraj Manna <subharaj(dot)manna(at)gmail(dot)com>
To: Keith <keith(at)keithf4(dot)com>
Cc: Fabio Pardi <f(dot)pardi(at)portavita(dot)eu>, pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: Replication lag from transaction logs
Date: 2018-06-18 15:56:55
Message-ID: CAF6DVKOJyk_ARY+yKhFvMh_awvu09XHXMJp+JLzZtHaEr5vFxw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Thanks Keith this is useful.

One more query if I need to know that if a have fallen too far behind and
the WAL is not available. I guess I can do this. Let me know if my
understanding is correct.

- Run pg_controldata <DATA_DIR> on the slave node which has been down
for long.
- It will output the details about the WAL along with the WAL file name
from where it will start the replication. (Field to look for in the output–
'Latest checkpoint's REDO WAL file')
- Then check if the file mentioned in ` 'Latest checkpoint's REDO WAL
file' is present in master `pg_wal` directory. If not then slave have
fallen too far behind and will not be able to recover from WAL.

On Mon, Jun 18, 2018 at 7:39 PM, Keith <keith(at)keithf4(dot)com> wrote:

> The portion before the slash is the actual WAL file itself that it got to
> start replaying, usually the last few characters of the WAL file name. The
> portion after the slash is the position in the WAL file itself. Default
> size of each WAL is 16MB, so if that first portion doesn't match the
> primary, you can count back that many WAL files x16MB. If it's on the same
> WAL file, I'm not really sure how to get a more precise byte lag than that.
>
> Keith
>
>
> On Mon, Jun 18, 2018 at 9:59 AM, Debraj Manna <subharaj(dot)manna(at)gmail(dot)com>
> wrote:
>
>> Thanks Keith I am looking something like the solution you suggested.
>>
>> Once I run pg_controldata on the slave data directory it will be give me
>> the latest checkpoint like below. Can you let me know a bit more detail on
>> how can I get the approximate lag in the slave in bytes from this?
>>
>> Latest checkpoint location: 2F9/B38DE758
>>
>>
>>
>>
>> On Mon 18 Jun, 2018, 6:53 PM Keith, <keith(at)keithf4(dot)com> wrote:
>>
>>> You can use pg_controldata on any data directory to get its rough
>>> position in WAL replay, even when the database isn't running
>>>
>>> https://www.postgresql.org/docs/current/static/app-pgcontroldata.html
>>>
>>> On Mon, Jun 18, 2018 at 8:56 AM, Fabio Pardi <f(dot)pardi(at)portavita(dot)eu>
>>> wrote:
>>>
>>>> That I know, if standby is down, you cannot know exactly the last
>>>> applied transaction log file.
>>>>
>>>> But you could get a gross estimate if you check how many WAL files is
>>>> the standby behind, just listing the WAL files on the master and on
>>>> the standby.
>>>>
>>>> Given that you can login on the standby machine.
>>>>
>>>> regards,
>>>>
>>>> fabio pardi
>>>>
>>>>
>>>> On 18/06/18 14:43, Debraj Manna wrote:
>>>>
>>>> Is it possible to figure out the replication lag from transaction log
>>>> files in the slave and master without querying postgres on the slave?
>>>> Basically figuring out the replication lag in a slave when the postgres is
>>>> down in that slave.
>>>>
>>>>
>>>>
>>>
>

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Scott Ribe 2018-06-18 17:03:27 Re: Replication lag from transaction logs
Previous Message Keith 2018-06-18 14:09:24 Re: Replication lag from transaction logs