BUG #18815: Logical replication worker Segmentation fault

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: sergey(dot)belyashov(at)gmail(dot)com
Subject: BUG #18815: Logical replication worker Segmentation fault
Date: 2025-02-17 10:52:32
Message-ID: 18815-2a0407cc7f40b327@postgresql.org
Lists: pgsql-bugs pgsql-hackers

The following bug has been logged on the website:

Bug reference: 18815
Logged by: Sergey Belyashov
Email address: sergey(dot)belyashov(at)gmail(dot)com
PostgreSQL version: 17.3
Operating system: Debian bookworm x86_64
Description:

Today I tried to upgrade my cluster from postgresql-16 to postgresql-17. The
upgrade was successful until I restored some logical replication
subscriptions. When a subscription is activated and the first data arrives,
the server logs:
2025-02-17 13:34:08.975 [98417] LOG: logical replication apply worker for subscription "node4_closed_sessions_sub" has started
2025-02-17 13:34:11.213 [62583] LOG: background worker "logical replication apply worker" (PID 98417) was terminated by signal 11: Segmentation fault
2025-02-17 13:34:11.213 [62583] LOG: terminating any other active server processes
2025-02-17 13:34:11.240 [62583] LOG: all server processes terminated; reinitializing
2025-02-17 13:34:11.310 [98418] LOG: database system was interrupted; last known up at 2025-02-17 13:22:08
and then it restarts.
The kernel logged the following:
[94740743.468001] postgres[98417]: segfault at 10 ip 0000562b2b74d69c sp 00007fff284a7320 error 4 in postgres[562b2b6bb000+595000]
[94740743.468173] Code: 1f 80 00 00 00 00 44 89 e0 48 8b 15 56 0b 82 00 f7 d0 48 98 4c 8b 3c c2 eb 99 0f 1f 40 00 55 48 89 e5 53 48 89 fb 48 83 ec 08 <8b> 7f 10 e8 4c b1 32 00 8b 7b 14 85 ff 75 15 48 89 df 48 8b 5d f8
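
For anyone triaging the kernel line above, here is a minimal sketch of how
its fields can be decoded. The helper name decode_segfault is hypothetical,
and the reading of the "error" field assumes the standard x86 page-fault
error-code bits (bit 0: page present, bit 1: write, bit 2: user mode). The
computed offset is relative to the start of the mapping the kernel printed,
which is not necessarily file offset 0 of the postgres binary.

```python
import re

# Kernel line from the report, joined onto one line.
LINE = (
    "[94740743.468001] postgres[98417]: segfault at 10 ip 0000562b2b74d69c sp "
    "00007fff284a7320 error 4 in postgres[562b2b6bb000+595000]"
)

# All numeric fields in this message are printed by the kernel in hexadecimal.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Hypothetical helper: split a kernel segfault line into its fields."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "object": m["obj"],
        "fault_addr": int(m["addr"], 16),   # address whose access faulted
        "offset": ip - base,                # instruction offset inside the mapping
        "page_present": bool(err & 0x1),    # False: page was not mapped
        "write": bool(err & 0x2),           # False: the access was a read
        "user_mode": bool(err & 0x4),       # True: fault happened in user space
    }


info = decode_segfault(LINE)
print(f"fault address: {info['fault_addr']:#x}")
print(f"offset in {info['object']} mapping: {info['offset']:#x}")
print(f"read of unmapped page in user mode: "
      f"{info['user_mode'] and not info['write'] and not info['page_present']}")
```

Under these assumptions the line describes a user-mode read of the unmapped
address 0x10. That pattern is consistent with reading a field at a small
offset from a NULL pointer, but this is an inference from the kernel message,
not something confirmed by a backtrace.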

After some investigation I found that the segfault is caused by one kind of
subscription: subscriptions for huge tables that are partitioned on both
publisher and subscriber (via root). These subscriptions are created with
copy_data=false (the source table is huge and is modified only by inserts
and partition detaches; the data transfer is not compressed, so an initial
copy may take days). The segfault does not occur immediately after the
subscription is created; it occurs when data arrives from the publisher. The
subscriber then restarts, recovers, starts the subscription again, hits the
segfault, and repeats this cycle until the subscription is disabled.

Subscriptions for small tables without partitions work fine.

The publisher server version makes no difference: publishers on both 16 and
17 cause the segfault on the subscriber (version 17.3).

PostgreSQL versions 12-16 worked for years without any segfault with the
same partitioned tables and publications/subscriptions.
postgresql-17=17.3-3.pgdg120+1 was installed from the repository:
http://apt.postgresql.org/pub/repos/apt/ bookworm-pgdg main
