Re: Postgres Logical Replication - how to see what subscriber is doing with received data?

From: Shaheed Haque <shaheedhaque(at)gmail(dot)com>
To: Michael Jaskiewicz <mjaskiewicz(at)ghx(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Postgres Logical Replication - how to see what subscriber is doing with received data?
Date: 2024-09-01 16:22:01
Message-ID: CAHAc2je=cCww1xNmZ1TpZG6vwOxut8htJ7NdWotGjrHGuj8ELg@mail.gmail.com
Lists: pgsql-general

Since nobody more knowledgeable has replied...

I'm very interested in this area and still surprised that there is no
official, convenient, or standard way to approach this (see
https://www.postgresql.org/message-id/CAHAc2jdAHvp7tFZBP37awcth%3DT3h5WXCN9KjZOvuTNJaAAC_hg%40mail.gmail.com).

Based partly on that thread, I ended up with a script that connects to both
ends of the replication and loops, comparing the row counts of each table on
the publisher and the subscriber.
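
A minimal sketch of that kind of loop is below. It is not the actual script:
the function names, the table list and the polling interval are all made up
for illustration, and it assumes plain unqualified table names that you
trust. It works with any DB-API 2.0 connection. Matching counts only suggest
the subscriber has caught up, they do not prove the rows are identical, and
count(*) can be expensive on big tables.

```python
import time


def table_counts(conn, tables):
    """Return {table: row count} using any DB-API 2.0 connection."""
    counts = {}
    cur = conn.cursor()
    try:
        for table in tables:
            # Table names come from our own trusted list, not from user input.
            cur.execute(f'SELECT count(*) FROM "{table}"')
            counts[table] = cur.fetchone()[0]
    finally:
        cur.close()
    return counts


def diff_counts(pub, sub):
    """Return {table: (publisher, subscriber)} for tables whose counts differ."""
    return {t: (pub[t], sub.get(t)) for t in pub if pub[t] != sub.get(t)}


def wait_for_catch_up(pub_conn, sub_conn, tables, interval=30.0, max_rounds=None):
    """Poll both ends until every table's count matches.

    Returns True once all counts match, or False if max_rounds polls
    pass without a match (max_rounds=None means poll forever).
    """
    rounds = 0
    while True:
        behind = diff_counts(table_counts(pub_conn, tables),
                             table_counts(sub_conn, tables))
        if not behind:
            return True
        for table, (p, s) in sorted(behind.items()):
            print(f"{table}: publisher={p} subscriber={s}")
        rounds += 1
        if max_rounds is not None and rounds >= max_rounds:
            return False
        time.sleep(interval)
```

With psycopg2, for example, you would open one connection per side with
psycopg2.connect(...), set autocommit = True on each so the loop does not sit
in one long transaction, and pass both to wait_for_catch_up() along with the
list of replicated tables.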

On Fri, 30 Aug 2024, 12:38 Michael Jaskiewicz, <mjaskiewicz(at)ghx(dot)com> wrote:

> I've got two Postgres 13 databases on AWS RDS.
>
> - One is a master, the other a slave using logical replication.
> - Replication has fallen behind by about 350 GB.
> - The slave was maxed out in terms of CPU for the past four days
> because of some jobs that were ongoing so I'm not sure what logical
> replication was able to replicate during that time.
> - I killed those jobs and now CPU on the master and slave are both low.
> - I look at the subscriber via `select * from pg_stat_subscription;`
> and see that latest_end_lsn is advancing albeit very slowly.
> - The publisher says write/flush/replay lags are all 13 minutes behind
> but it's been like that for most of the day.
> - I see no errors in the logs on either the publisher or subscriber
> outside of some simple SQL errors that users have been making.
> - CloudWatch reports low CPU utilization, low I/O, and low network.
>
>
>
> Is there anything I can do here? Previously I set wal_receiver_timeout
> to 0 because I had replication issues, and that helped things. I
> wish I had *some* visibility here to get any kind of confidence that it's
> going to pull through, but other than these lsn values and database logs,
> I'm not sure what to check.
>
>
>
> Sincerely,
>
> mj
>
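
On the LSN values mentioned above: an LSN is printed as two hex numbers
separated by a slash (high 32 bits, then low 32 bits), so the gap between two
of them can be turned into bytes, which is easier to watch shrink than the
raw values. A rough sketch, with hypothetical helper names, assuming you have
already fetched pg_current_wal_lsn() from the publisher and latest_end_lsn
from pg_stat_subscription on the subscriber as text:

```python
def lsn_to_int(lsn):
    """Convert a textual LSN such as '16/B374D848' to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def lag_bytes(publisher_lsn, subscriber_lsn):
    """Bytes of WAL between the publisher's position and the subscriber's."""
    return lsn_to_int(publisher_lsn) - lsn_to_int(subscriber_lsn)
```

Sampling lag_bytes() every few minutes and checking whether it goes down
gives at least a rough idea of whether the subscriber is gaining ground. On
the server side, pg_wal_lsn_diff() does the same subtraction in SQL.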
