From: | Steve Baldwin <steve(dot)baldwin(at)gmail(dot)com> |
---|---|
To: | "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Help diagnosing replication (copy) error |
Date: | 2024-03-08 21:50:47 |
Message-ID: | CAKE1Aib-6yvpd1mvn02haEr=EHYO2Hoq1BLQcw3JE5Ob95jZYw@mail.gmail.com |
Lists: | pgsql-general |

Hi,

I'm in the process of migrating a cluster from 15.3 to 16.2. We have a
'zero downtime' requirement so I'm using logical replication to create the
new cluster and then perform the switch in the application.

I have a situation where all but one table have done their initial copy.
The remaining table is the largest (of course), and the replication slot
that is assigned for the copy (pg_378075177_sync_60067_7343845372910323059)
is showing as 'active=false' if I select from pg_replication_slots on the
publisher.

I've checked the recent logs for both the publishing cluster and the
subscribing cluster but I can't see any replication errors. I guess I could
have missed them, but it doesn't seem like anything is being 'retried' like
I've seen in the past with replication errors.

I've used this mechanism for zero-downtime upgrades multiple times in the
past, and have recently used it to upgrade smaller clusters from 15.x to
16.2 without issue.

The clusters are hosted on AWS RDS, so I have no access to the servers, but
if that's the only way to diagnose the issue, I can create a support case.

Does anyone have any suggestions as to where I should look for the issue?

Thanks,

Steve