Re: BUG #17524: Increase in WAL size due to logical replication with publication contain a table with low activity.

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: raman(dot)kumar(at)r-indventures(dot)com, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17524: Increase in WAL size due to logical replication with publication contain a table with low activity.
Date: 2022-06-21 13:38:36
Message-ID: CAA4eK1+BJDrwp=jWGcXY6iuicnc1Y5ACNNa5hQBcNFykEfaCCA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Mon, Jun 20, 2022 at 7:07 PM PG Bug reporting form
<noreply(at)postgresql(dot)org> wrote:
>
> The following bug has been logged on the website:
>
> Bug reference: 17524
> Logged by: Raman Kumar
> Email address: raman(dot)kumar(at)r-indventures(dot)com
> PostgreSQL version: 13.6
> Operating system: Ubuntu
> Description:
>
> We are using logical replication to source the event from postgres. and
> publication includes a single table with (Insert, update, delete and
> truncate ).
> Everything works great, Until we run into a strange issue i.e. Increase is
> the disk usage, abnormally.
> On looking further we have found that some of the disk space occupy by the
> wal.
> Using this query
> ```
> Select slot_name, pg_current_wal_lsn(), confirmed_flush_lsn,
> (pg_current_wal_lsn() - confirmed_flush_lsn) as lag from
> pg_replication_slots;
> ```
> ```
> "slot_name" "pg_current_wal_lsn" "confirmed_flush_lsn" "lsn_distance"
> "poc_kafka_connect" "109C/AC001020" "109C/94909990" 393180816
> ```
>
> ```
> select usename, client_addr, pg_current_wal_lsn(), flush_lsn,
> (pg_current_wal_lsn() - flush_lsn) as lag from pg_stat_replication;
> ```
>
> ```
> "usename" "client_addr" "pg_current_wal_lsn" "flush_lsn" "lag"
> "postgres" "x.x.x.x" "109C/AC0011A0" "109C/94909990" 393181200
> ```
>
> We have notice the replication lag. But as the publication includes on the
> single table, the last changes are already replicated/published.
> So Technically there is no replication lag as per publication.
>

The lag is computed based on the difference between the WAL location
written in the publisher and the WAL location confirmed by the
subscriber as replayed. Normally for non-published data, the latest
WAL location is confirmed by the subscriber either because we send an
empty transaction (pair of BEGIN ... COMMIT commands) or via
keep-alive messages. Now, it is not clear why none of those is working
in your case? Are there any large transactions happening on the
non-published tables? You can once check data/pg_replslot directory to
confirm if there are any spill files present there?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Nathan Bossart 2022-06-21 17:56:16 Re: Extension pg_trgm, permissions and pg_dump order
Previous Message Kyotaro Horiguchi 2022-06-21 08:46:54 Re: Using PQexecQuery in pipeline mode produces unexpected Close messages