From: | Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> |
---|---|
To: | tomahkvt(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #18433: Logical replication timeout |
Date: | 2024-04-17 10:48:44 |
Message-ID: | CANhcyEVo5hjhtqe8qyWRwpWgfgK4C1j0++zVZMkVtdTMtcdLCg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
Hi,
On Mon, 15 Apr 2024 at 13:09, PG Bug reporting form
<noreply(at)postgresql(dot)org> wrote:
>
> The following bug has been logged on the website:
>
> Bug reference: 18433
> Logged by: Kostiantyn
> Email address: tomahkvt(at)gmail(dot)com
> PostgreSQL version: 13.14
> Operating system: AWS RDS
> Description:
>
> On Postgresql 10 we used the following approach for the big tables:
> 1) Download schema from the source database instance
> 2) Deleted PK, FK, and Indexes for tables bigger than 100Gb from the
> schema
> 3)Upload the schema to the destination DB.
> 4) Configure Logical replication between source and destination DB.
> 5) When last table logical replication table synchronization worker for
> subscription "db_name_public_subscription", table "table_name" has
> finished
> 6) we created all the necessary PK, FK, and Indexes.
> This approach allowed to us upload data more quickly. This approach was
> working great on PostgreSQL 10.
>
> We tried the same approach for Postgresql13, but we got an error.
> 1) Download schema from the source database instance
> 2) Deleted PK, FK, and Indexes for tables bigger than 100Gb from the
> schema
> 3)Upload the schema to the destination DB.
> 4) configurated identity replication full at source DB for tables bigger
> than 100Gb
> 5) Configured Logical replication between source and destination DB.
> 6) During catchup on this big table process we got the following
> messages:
> Source DB
> 2024-04-08 15:38:34 UTC:(27994):replication_role@:[22047]:LOG: terminating
> walsender process due to replication timeout
> 2024-04-08 15:38:34 UTC:(27994):replication_role@:[22047]:CONTEXT: slot
> "db_name_public_subscription", output plugin "pgoutput", in the begin
> callback, associated LSN 13705/2E913050
> 2024-04-08 15:38:34 UTC:(27994):replication_role@:[22047]:STATEMENT:
> START_REPLICATION SLOT "db_name_public_subscription" LOGICAL 13702/C2C8FB30
> (proto_version '1', publication_names '"db_name_public_publication"')
> 2024-04-08 15:38:34 UTC:(36862):replication_role@:[22811]:LOG: terminating
> walsender process due to replication timeout
> 2024-04-08 15:38:34 UTC:(36862):replication_role@:[22811]:CONTEXT: slot
> "db_name_public_subscription_18989108_sync_17127", output plugin "pgoutput",
> in the begin callback, associated LSN 13703/27942B70
> 2024-04-08 15:38:34 UTC:(36862):replication_role@:[22811]:STATEMENT:
> START_REPLICATION SLOT "db_name_public_subscription_18989108_sync_17127"
> LOGICAL 13703/17622B58 (proto_version '1', publication_names
> '"db_name_public_publication"')
>
> One important point. If there is no request to source DB logical replication
> works fine for big tables.
> I saw the messages in PostgreSQL bugs like
> https://www.postgresql.org/message-id/flat/718213.1601410160%40sss.pgh.pa.us#7e61dd07661901b505bcbd74ce5f5f28
> But I also did some tests and increased wal_keep_size
> and max_slot_wal_keep_size to 1GB. And I set wal_sender_timeout to 1h but
> without success. The setup works in PG 13 only with a small amount of data.
I went through the issue and I think that the given logs are appearing
due to some delay in the Apply Worker process.
I could reproduce it in my development environment by applying delays
in Apply Worker.
I think this issue can be resolved by setting 'wal_sender_timeout' to
a greater value. Can you try setting 'wal_sender_timeout' to a greater
value?
Also, I noticed that while using Postgresql13 you are configuring the
table in sourceDB as REPLICA IDENTITY FULL but not doing the same in
Postgresql10. Is there any specific reason for it?
I pointed it out because REPLICA IDENTITY FULL has a different
behaviour and sets the entire row as key.
Thanks and Regards,
Shlok Kyal
From | Date | Subject | |
---|---|---|---|
Next Message | Myka Dresser | 2024-04-17 14:42:39 | Re: BUG #18411: Unable to create database with owner on AWS RDS |
Previous Message | Noah Misch | 2024-04-16 19:34:02 | Re: relfrozenxid may disagree with row XIDs after 1ccc1e05ae |