Re: BUG #18815: Logical replication worker Segmentation fault

From: Sergey Belyashov <sergey(dot)belyashov(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18815: Logical replication worker Segmentation fault
Date: 2025-02-17 19:17:05
Message-ID: CAOe0RDwbowSGv8qeuqyHzJndtwV5JyPEG4YwmaAvgquC=gjz+Q@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

Hi,

I think a backtrace will help.
Core was generated by `postgres: 17/main: logical replication apply
worker for subscription 602051860'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00005635402c869c in brinRevmapTerminate (revmap=0x0)
at ./build/../src/backend/access/brin/brin_revmap.c:102
(gdb) backtrace
#0 0x00005635402c869c in brinRevmapTerminate (revmap=0x0)
at ./build/../src/backend/access/brin/brin_revmap.c:102
#1 0x00005635402bfddd in brininsertcleanup (index=<optimized out>,
indexInfo=<optimized out>)
at ./build/../src/backend/access/brin/brin.c:515
#2 0x0000563540479309 in ExecCloseIndices
(resultRelInfo=resultRelInfo@entry=0x563541cab8d0)
at ./build/../src/backend/executor/execIndexing.c:248
#3 0x000056354048067f in ExecCleanupTupleRouting
(mtstate=0x563541c6db58, proute=0x563541cab638)
at ./build/../src/backend/executor/execPartition.c:1270
#4 0x00005635405c89f7 in finish_edata (edata=0x563541ca0fa8)
at ./build/../src/backend/replication/logical/worker.c:718
#5 0x00005635405cc6c4 in apply_handle_insert (s=0x7f61d2a3a1d8)
at ./build/../src/backend/replication/logical/worker.c:2438
#6 apply_dispatch (s=s@entry=0x7ffd30d95a70) at
./build/../src/backend/replication/logical/worker.c:3296
#7 0x00005635405cdb7f in LogicalRepApplyLoop (last_received=106949425100872)
at ./build/../src/backend/replication/logical/worker.c:3587
#8 start_apply (origin_startpos=origin_startpos@entry=0)
at ./build/../src/backend/replication/logical/worker.c:4429
#9 0x00005635405ce11f in run_apply_worker () at
./build/../src/backend/replication/logical/worker.c:4550
#10 ApplyWorkerMain (main_arg=<optimized out>) at
./build/../src/backend/replication/logical/worker.c:4719
#11 0x0000563540594bf8 in BackgroundWorkerMain (startup_data=<optimized out>,
startup_data_len=<optimized out>) at
./build/../src/backend/postmaster/bgworker.c:848
#12 0x0000563540596daa in postmaster_child_launch
(child_type=child_type@entry=B_BG_WORKER,
startup_data=startup_data@entry=0x563541bc3618 "",
startup_data_len=startup_data_len@entry=1472,
client_sock=client_sock@entry=0x0) at
./build/../src/backend/postmaster/launch_backend.c:277
#13 0x0000563540598f88 in do_start_bgworker (rw=0x563541bc3618)
at ./build/../src/backend/postmaster/postmaster.c:4272
#14 maybe_start_bgworkers () at
./build/../src/backend/postmaster/postmaster.c:4503
#15 0x0000563540599fea in process_pm_pmsignal () at
./build/../src/backend/postmaster/postmaster.c:3776
#16 ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1669
#17 0x000056354059c7c8 in PostmasterMain (argc=argc@entry=5,
argv=argv@entry=0x563541b229c0)
at ./build/../src/backend/postmaster/postmaster.c:1374
#18 0x00005635402bf5b1 in main (argc=5, argv=0x563541b229c0) at
./build/../src/backend/main/main.c:199

The destination (subscriber) table has two timestamp columns, "started"
and "closed", with a BRIN index on each of them. The table is partitioned
by range on the "closed" column. Each partition is split into 6
subpartitions by list (remainder of id).
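
For anyone trying to build a reproducer, the layout described above could be sketched roughly as follows. This is only an illustration under assumptions: the table name "sessions", the column types, the monthly range bounds, and the partition naming are all hypothetical and do not come from the real schema. The script just prints DDL for one range partition on "closed", list-subpartitioned into 6 pieces by id % 6, with a BRIN index on each timestamp column.

```python
"""Print DDL for a table shaped like the one described (all names hypothetical)."""

SUBPARTITIONS = 6  # "6 subpartitions by list (remainder of id)"


def build_ddl(table="sessions", ranges=(("2025_01", "2025-01-01", "2025-02-01"),)):
    """Return a list of SQL statements; column types and bounds are assumptions."""
    stmts = [
        # Root table, range-partitioned on "closed".
        f"CREATE TABLE {table} (id bigint NOT NULL, started timestamptz, "
        f"closed timestamptz) PARTITION BY RANGE (closed);",
        # BRIN indexes on the root; partitions created later get matching indexes.
        f"CREATE INDEX {table}_started_brin ON {table} USING brin (started);",
        f"CREATE INDEX {table}_closed_brin ON {table} USING brin (closed);",
    ]
    for suffix, lo, hi in ranges:
        part = f"{table}_{suffix}"
        # Each range partition is itself list-partitioned on the remainder of id.
        stmts.append(
            f"CREATE TABLE {part} PARTITION OF {table} "
            f"FOR VALUES FROM ('{lo}') TO ('{hi}') "
            f"PARTITION BY LIST ((id % {SUBPARTITIONS}));"
        )
        for r in range(SUBPARTITIONS):
            stmts.append(
                f"CREATE TABLE {part}_r{r} PARTITION OF {part} FOR VALUES IN ({r});"
            )
    return stmts


if __name__ == "__main__":
    print("\n".join(build_ddl()))
```

The sketch assumes non-negative ids, so remainders fall in 0..5. It covers only the subscriber-side table shape, not the publication, the subscription, or the data needed to trigger the crash.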

Best regards,
Sergey Belyashov

On Mon, 17 Feb 2025 at 19:39, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> > After some investigation I found that the segfault is caused by one type of
> > subscription: a subscription for huge partitioned tables on publisher and
> > subscriber (via root), created with copy_data=false
> > (the source table is updated by inserts and partition detaches, and it is huge;
> > the data transfer is not compressed, so it may take days). The segfault does not
> > occur immediately after subscription creation, but when data arrives
> > from the publisher. Then the subscriber restarts, recovers, runs the subscription
> > again, hits the segfault, and repeats until the subscription is disabled.
>
> This is not enough information for anyone else to reproduce the
> problem; it very likely depends on details that you haven't mentioned.
> Can you create a reproducer case? I'm hoping for a script that sets
> up the necessary tables and subscriptions and populates the tables
> with enough dummy data to cause the failure.
>
> Something that might be less work for you is to get a stack trace
> from the crash:
>
> https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend
>
> However, I make no promises that we can isolate the cause from
> just a stack trace. A reproducer would be much better.
>
> regards, tom lane
