BUG #18893: Segfault during analyze pg_database

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: tharakan(at)gmail(dot)com
Subject: BUG #18893: Segfault during analyze pg_database
Date: 2025-04-12 03:34:59
Message-ID: 18893-da17531047e6447f@postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 18893
Logged by: Robins Tharakan
Email address: tharakan(at)gmail(dot)com
PostgreSQL version: Unsupported/Unknown
Operating system: Ubuntu
Description:

Creating a number of databases and then running CHECKPOINT causes a segfault
in an autovacuum worker.

Tested on a recent commit: 847bbb21f8c4eb0e2b47417684ad2ba9255c9e80.

The backtrace is below. In addition, every time I hit this crash, postgres
was analyzing pg_database.

Repro (a few runs may be required)
=====
# seq 1 100 | xargs -I{} psql -Atq -c "DROP DATABASE t{};" postgres
seq 1 100 | xargs -I{} psql -Atq -c "CREATE DATABASE t{};" postgres
psql -Atq -c "CHECKPOINT" postgres

Error Log (for multiple crashes)
=========
$ tail -10000 logfile | grep "Failed process was running"
2025-04-12 07:23:55.096 ACST [2833183] DETAIL: Failed process was running:
autovacuum: VACUUM ANALYZE pg_catalog.pg_database
2025-04-12 07:24:55.634 ACST [2833183] DETAIL: Failed process was running:
autovacuum: ANALYZE pg_catalog.pg_database
2025-04-12 07:31:02.634 ACST [2833183] DETAIL: Failed process was running:
autovacuum: VACUUM ANALYZE pg_catalog.pg_database
2025-04-12 11:59:31.411 ACST [2845956] DETAIL: Failed process was running:
autovacuum: ANALYZE pg_catalog.pg_database
2025-04-12 12:13:09.974 ACST [2846810] DETAIL: Failed process was running:
autovacuum: VACUUM ANALYZE pg_catalog.pg_database
2025-04-12 12:38:07.432 ACST [2846810] DETAIL: Failed process was running:
autovacuum: VACUUM ANALYZE pg_catalog.pg_database
2025-04-12 12:41:42.729 ACST [2846810] DETAIL: Failed process was running:
autovacuum: VACUUM ANALYZE pg_catalog.pg_database
2025-04-12 12:43:13.276 ACST [2846810] DETAIL: Failed process was running:
autovacuum: VACUUM ANALYZE pg_catalog.pg_database
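
To see at a glance which statements the crashed workers were running, the grep above can be extended into a tally. This is only a sketch: the function name tally_failed_statements is made up for illustration, and it assumes each "Failed process was running" entry sits on a single line in the server log (the line breaks shown above come from mail wrapping).

```shell
# Hypothetical helper, not part of PostgreSQL: count how often each
# statement appears in "Failed process was running" log entries.
# Assumption: every such entry is on one line in the log file.
tally_failed_statements() {
    # $1 = path to the PostgreSQL server log
    grep "Failed process was running:" "$1" \
        | sed 's/.*Failed process was running:[[:space:]]*//' \
        | sort | uniq -c | sort -rn
}
```

Called as `tally_failed_statements logfile`, it prints one line per distinct statement, most frequent first, which makes it easy to check whether crashes always involve pg_catalog.pg_database.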

Error Log (for 1 crash)
=========
2025-04-12 12:43:03.279 ACST [2849996] LOG: checkpoint starting: immediate
force wait
2025-04-12 12:43:13.276 ACST [2846810] LOG: autovacuum worker (PID 2851288)
was terminated by signal 11: Segmentation fault
2025-04-12 12:43:13.276 ACST [2846810] DETAIL: Failed process was running:
autovacuum: VACUUM ANALYZE pg_catalog.pg_database
2025-04-12 12:43:13.276 ACST [2846810] LOG: terminating any other active
server processes
2025-04-12 12:43:13.280 ACST [2846810] LOG: all server processes
terminated; reinitializing
2025-04-12 12:43:13.346 ACST [2851293] LOG: database system was
interrupted; last known up at 2025-04-12 12:42:59 ACST
2025-04-12 12:43:23.175 ACST [2851293] LOG: database system was not
properly shut down; automatic recovery in progress
2025-04-12 12:43:23.196 ACST [2851293] LOG: redo starts at 0/BB5A2BE0
2025-04-12 12:43:23.197 ACST [2851293] WARNING: could not open directory
"base/49251": No such file or directory
2025-04-12 12:43:23.197 ACST [2851293] CONTEXT: WAL redo at 0/BB5A2CB0 for
Database/DROP: dir 1663/49251
2025-04-12 12:43:23.197 ACST [2851293] WARNING: some useless files may be
left behind in old database directory "base/49251"
2025-04-12 12:43:23.197 ACST [2851293] CONTEXT: WAL redo at 0/BB5A2CB0 for
Database/DROP: dir 1663/49251
2025-04-12 12:43:24.620 ACST [2851293] LOG: unexpected pageaddr 0/A6D3A000
in WAL segment 0000000100000000000000D5, LSN 0/D5D3A000, offset 13869056
2025-04-12 12:43:24.620 ACST [2851293] LOG: redo done at 0/D5D39198 system
usage: CPU: user: 0.88 s, system: 0.07 s, elapsed: 1.42 s
2025-04-12 12:43:24.633 ACST [2851294] LOG: checkpoint starting:
end-of-recovery immediate wait
2025-04-12 12:43:44.451 ACST [2851294] LOG: checkpoint complete: wrote
16284 buffers (99.4%), wrote 3 SLRU buffers; 0 WAL file(s) added, 0 removed,
26 recycled; write=0.173 s, sync=19.592 s, total=19.820 s; sync files=29806,
longest=0.019 s, average=0.001 s; distance=433757 kB, estimate=433757 kB;
lsn=0/D5D3A048, redo lsn=0/D5D3A048
2025-04-12 12:43:44.467 ACST [2846810] LOG: database system is ready to
accept connections

SQL Output
==========
postgres=# checkpoint;
WARNING: terminating connection because of crash of another server
process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.
Time: 3485.895 ms (00:03.486)
!?>

Backtrace
=========
(gdb) bt
#0 PopActiveSnapshot () at snapmgr.c:766
#1 0x0000559978e4aff5 in vacuum (relations=0x55999bb4f510,
params=0x55999bb48120, bstrategy=0x55999bb42880, vac_context=0x55999bb4f3c0,
isTopLevel=true) at vacuum.c:611
#2 0x000055997905242c in autovacuum_do_vac_analyze (tab=0x55999bb48118,
bstrategy=0x55999bb42880) at autovacuum.c:3160
#3 0x0000559979051164 in do_autovacuum () at autovacuum.c:2439
#4 0x000055997904fd05 in AutoVacWorkerMain (startup_data=0x0,
startup_data_len=0) at autovacuum.c:1594
#5 0x0000559979056ab7 in postmaster_child_launch
(child_type=B_AUTOVAC_WORKER, child_slot=2022, startup_data=0x0,
startup_data_len=0, client_sock=0x0) at launch_backend.c:290
#6 0x000055997905da7e in StartChildProcess (type=B_AUTOVAC_WORKER) at
postmaster.c:3973
#7 0x000055997905dc0d in StartAutovacuumWorker () at postmaster.c:4037
#8 0x000055997905d6ce in process_pm_pmsignal () at postmaster.c:3794
#9 0x000055997905a803 in ServerLoop () at postmaster.c:1695
#10 0x000055997905a1d2 in PostmasterMain (argc=3, argv=0x55999ba24f80) at
postmaster.c:1400
#11 0x0000559978f021f3 in main (argc=3, argv=0x55999ba24f80) at main.c:227

Backtrace Full
==============
#0 PopActiveSnapshot () at snapmgr.c:766
newstack = 0x55999bb4f3c0
#1 0x0000559978e4aff5 in vacuum (relations=0x55999bb4f510,
params=0x55999bb48120, bstrategy=0x55999bb42880, vac_context=0x55999bb4f3c0,
isTopLevel=true) at vacuum.c:611
in_vacuum = false
stmttype = 0x55997951e3d0 "VACUUM"
in_outer_xact = false
use_own_xacts = true
__func__ = "vacuum"
#2 0x000055997905242c in autovacuum_do_vac_analyze (tab=0x55999bb48118,
bstrategy=0x55999bb42880) at autovacuum.c:3160
rangevar = 0x55999bb4d4b0
rel = 0x55999bb4d500
rel_list = 0x55999bb4d530
vac_context = 0x55999bb4f3c0
#3 0x0000559979051164 in do_autovacuum () at autovacuum.c:2439
_save_exception_stack = 0x7ffef3180850
_save_context_stack = 0x0
_local_sigjmp_buf = {{__jmpbuf = {140732976861944,
2786496174943352778, 0, 140732976861976, 94117656164696, 139965642006560,
2786496174997878730, 8242866857011034058},
__mask_was_saved = 0, __saved_mask = {__val = {5460319232,
94118230406032, 6656, 94117652561175, 94118230399168, 16, 94117649680895,
26, 6240, 94118230406064,
94117652046186, 6656, 94118230399408, 140732976858672,
94117652048505, 0}}}}
_do_rethrow = false
tab = 0x55999bb48118
skipit = false
iter = {cur = 0x7f4c4571d828, end = 0x7f4c4571d828}
relid = 1262
classTup = 0x7f4c47393e18
isshared = true
cell__state = {l = 0x55999bb47b38, i = 0}
classRel = 0x7f4c494eaa88
tuple = 0x0
relScan = 0x55999bb42470
dbForm = 0x7f4c47392d80
table_oids = 0x55999bb47b38
orphan_oids = 0x0
ctl = {num_partitions = 0, ssize = 0, dsize = 140732976858832,
max_dsize = 94117644661494, keysize = 4, entrysize = 104, hash =
0x5599797a6b00 <TopTransactionStateData>,
match = 0x79361810, keycopy = 0x3f, alloc = 0x7ffef31812f8, hcxt =
0x7ffef3180710, hctl = 0x559978c6ff8e
<CommitTransactionCommandInternal+177>}
table_toast_map = 0x55999bb43470
cell = 0x55999bb47b50
bstrategy = 0x55999bb42880
key = {sk_flags = 0, sk_attno = 18, sk_strategy = 3, sk_subtype = 0,
sk_collation = 950, sk_func = {fn_addr = 0x5599791aecf4 <chareq>, fn_oid =
61, fn_nargs = 2,
fn_strict = true, fn_retset = false, fn_stats = 2 '\002',
fn_extra = 0x0, fn_mcxt = 0x55999bb41360, fn_expr = 0x0}, sk_argument =
116}
pg_class_desc = 0x55999bb41460
effective_multixact_freeze_max_age = 400000000
did_vacuum = false
found_concurrent_worker = false
i = 21913
__func__ = "do_autovacuum"
#4 0x000055997904fd05 in AutoVacWorkerMain (startup_data=0x0,
startup_data_len=0) at autovacuum.c:1594
dbname =
"template1\000\000\000\000\000\000\000p\030\000\000\000\000\000\000\0002os\276C\025C\200\000\000\000\000\000\000\000m\271<y\231U\000\000O\267<y\231U\000\000\0002os\036\000\000"
local_sigjmp_buf = {{__jmpbuf = {140732976861944,
2786496174865758154, 0, 140732976861976, 94117656164696, 139965642006560,
2786496174828009418, 8242866840620218314},
__mask_was_saved = 1, __saved_mask = {__val =
{18446744066192964099, 11214622847848677400, 139965631788948,
140732976859328, 4833844260311609856, 16, 140732976859424,
140732976859360, 4833844260311609856, 0, 139965642011360, 1,
94117649519579, 140732976861944, 94118229463072, 140732976859424}}}}
dbid = 1
__func__ = "AutoVacWorkerMain"
#5 0x0000559979056ab7 in postmaster_child_launch
(child_type=B_AUTOVAC_WORKER, child_slot=2022, startup_data=0x0,
startup_data_len=0, client_sock=0x0) at launch_backend.c:290
pid = 0
#6 0x000055997905da7e in StartChildProcess (type=B_AUTOVAC_WORKER) at
postmaster.c:3973
pmchild = 0x55999bab4528
pid = 32766
__func__ = "StartChildProcess"
#7 0x000055997905dc0d in StartAutovacuumWorker () at postmaster.c:4037
bn = 0x5000097e7
#8 0x000055997905d6ce in process_pm_pmsignal () at postmaster.c:3794
request_state_update = false
__func__ = "process_pm_pmsignal"

-
robins
https://robins.in
