Re: Help diagnosing replication (copy) error

From: Steve Baldwin <steve(dot)baldwin(at)gmail(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Help diagnosing replication (copy) error
Date: 2024-03-08 22:04:37
Message-ID: CAKE1AibE1h=RVfV1GVCAQodGPB9bRuC096PLKf0S4ad1TrGkuA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sat, Mar 9, 2024 at 8:56 AM Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
wrote:

>
> What are the rest of the values in pg_replication_slots?
>
> b2bcreditonline=> select * from pg_replication_slots;
slot_name | plugin | slot_type |
datoid | database | temporary | active | active_pid | xmin |
catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status |
safe_wal_size | two_phase
---------------------------------------------+----------+-----------+--------+-----------------+-----------+--------+------------+------+--------------+---------------+---------------------+------------+---------------+-----------
b2bcreditonline_prod_e_master | pgoutput | logical |
16404 | b2bcreditonline | f | t | 13700 | |
456150233 | 2E2F/AC07B760 | 2E2F/AEEEA478 | reserved |
| f
b2bcreditonline_prod_sandbox_d_master | pgoutput | logical |
16404 | b2bcreditonline | f | t | 9232 | |
456150233 | 2E2F/AC07B760 | 2E2F/AEEEA560 | reserved |
| f
b2bcreditonline_prod_e_master_only | pgoutput | logical |
16404 | b2bcreditonline | f | t | 13710 | |
456150233 | 2E2F/AC07B760 | 2E2F/AEEEA560 | reserved |
| f
pg_378075177_sync_60067_7343845372910323059 | pgoutput | logical |
16404 | b2bcreditonline | f | f | | |
455719618 | 2E2A/1C0972E0 | 2E2A/1C097318 | extended |
| f
b2bcreditonline_prod_e_shard | pgoutput | logical |
16404 | b2bcreditonline | f | t | 13718 | |
456150233 | 2E2F/AC07B760 | 2E2F/AEEEA560 | reserved |
| f
(5 rows)

> Is there data in the subscriber side table?
>
> No there is not, although when I did a 'select count(*) from {table}' it
took several minutes to return 0 rows.

> What are the publisher and subscriber configurations?
>
> Not sure which settings, but here's a few.

(publisher - pg 15.3)
b2bcreditonline=> select name, setting from pg_settings where name like
'max\_%' escape '\';
name | setting
-----------------------------------+---------
max_connections | 5000
max_files_per_process | 1000
max_function_args | 100
max_identifier_length | 63
max_index_keys | 32
max_locks_per_transaction | 64
max_logical_replication_workers | 16
max_parallel_maintenance_workers | 2
max_parallel_workers | 8
max_parallel_workers_per_gather | 2
max_pred_locks_per_page | 2
max_pred_locks_per_relation | -2
max_pred_locks_per_transaction | 64
max_prepared_transactions | 0
max_replication_slots | 64
max_slot_wal_keep_size | -1
max_stack_depth | 6144
max_standby_archive_delay | 30000
max_standby_streaming_delay | 30000
max_sync_workers_per_subscription | 4
max_wal_senders | 96
max_wal_size | 4096
max_worker_processes | 32
(23 rows)

(subscriber - pg 16.2)
b2bcreditonline=> select name, setting from pg_settings where name like
'max\_%' escape '\';
name | setting
---------------------------------------------+---------
max_connections | 5000
max_files_per_process | 1000
max_function_args | 100
max_identifier_length | 63
max_index_keys | 32
max_locks_per_transaction | 64
max_logical_replication_workers | 16
max_parallel_apply_workers_per_subscription | 2
max_parallel_maintenance_workers | 2
max_parallel_workers | 8
max_parallel_workers_per_gather | 2
max_pred_locks_per_page | 2
max_pred_locks_per_relation | -2
max_pred_locks_per_transaction | 64
max_prepared_transactions | 0
max_replication_slots | 64
max_slot_wal_keep_size | -1
max_stack_depth | 6144
max_standby_archive_delay | 30000
max_standby_streaming_delay | 30000
max_sync_workers_per_subscription | 4
max_wal_senders | 96
max_wal_size | 4096
max_worker_processes | 32
(24 rows)

>
> >
> > I've checked the recent logs for both the publishing cluster and the
> > subscribing cluster but I can't see any replication errors. I guess I
> > could have missed them, but it doesn't seem like anything is being
> > 'retried' like I've seen in the past with replication errors.
> >
> > I've used this mechanism for zero-downtime upgrades multiple times in
> > the past, and have recently used it to upgrade smaller clusters from
> > 15.x to 16.2 without issue.
> >
> > The clusters are hosted on AWS RDS, so I have no access to the servers,
> > but if that's the only way to diagnose the issue, I can create a support
> > case.
> >
> > Does anyone have any suggestions as to where I should look for the issue?
> >
> > Thanks,
> >
> > Steve
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2024-03-08 22:11:15 Re: Question related to partitioning with pg_partman
Previous Message Adrian Klaver 2024-03-08 21:56:29 Re: Help diagnosing replication (copy) error