BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: exclusion(at)gmail(dot)com
Subject: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()
Date: 2023-06-24 13:00:00
Message-ID: 17994-5c7100b51b4790e9@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 17994
Logged by: Alexander Lakhin
Email address: exclusion(at)gmail(dot)com
PostgreSQL version: 16beta1
Operating system: Ubuntu 22.04
Description:

The following script:
for ((n=1;n<=10;n++)); do
echo "ITERATION $n"

createdb test
numclients=50
for ((c=1;c<=$numclients;c++)); do
cat << EOF | psql test -o /dev/null &
CREATE PUBLICATION testpub_$c FOR ALL TABLES;
CREATE TYPE rec_$c AS (f1 text, f2 text);
CREATE TEMP TABLE tbl_$c (f rec_$c);
INSERT INTO tbl_$c VALUES ((repeat('1234567890', 100000), 'a'));

UPDATE tbl_$c SET f.f2 = 'b';
EOF
done
wait
dropdb test

grep 'TRAP:' server.log && break;
done

invokes errors or a server crash:
ITERATION 1
ITERATION 2
ERROR: unsupported byval length: 32639
ITERATION 3
ITERATION 4
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
connection to server was lost
WARNING: terminating connection because of crash of another server
process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
dropdb: error: database removal failed: server closed the connection
unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
TRAP: failed Assert("(thisatt->attalign) == TYPALIGN_SHORT"), File:
"heaptuple.c", Line: 1312, PID: 1737623

The stack trace of the crash:
Core was generated by `postgres: law test [local] UPDATE
'.
Program terminated with signal SIGABRT, Aborted.

warning: Section `.reg-xstate/1737623' in core file too small.
#0 __pthread_kill_implementation (no_tid=0, signo=6,
threadid=140071061268416) at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6,
threadid=140071061268416) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=140071061268416) at
./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=140071061268416, signo=signo(at)entry=6) at
./nptl/pthread_kill.c:89
#3 0x00007f64d7c27476 in __GI_raise (sig=sig(at)entry=6) at
../sysdeps/posix/raise.c:26
#4 0x00007f64d7c0d7f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x0000555ede8dc3b3 in ExceptionalCondition (
conditionName=0x555edeb49700 "(thisatt->attalign) == TYPALIGN_SHORT",
fileName=0x555edeb49020 "heaptuple.c", lineNumber=1312) at assert.c:66
#6 0x0000555edc7b76d7 in heap_deform_tuple (tuple=0x7ffe49cfe570,
tupleDesc=0x7f64ca4562d8,
values=0x62500005ad40, isnull=0x62500005b130) at heaptuple.c:1312
#7 0x0000555edd389d0b in ExecEvalFieldStoreDeForm (state=0x62500005b230,
op=0x62500005b338,
econtext=0x62500005ac20) at execExprInterp.c:3210
#8 0x0000555edd3729b4 in ExecInterpExpr (state=0x62500005b230,
econtext=0x62500005ac20, isnull=0x7ffe49cfe900)
at execExprInterp.c:1446
#9 0x0000555edd3791f2 in ExecInterpExprStillValid (state=0x62500005b230,
econtext=0x62500005ac20,
isNull=0x7ffe49cfe900) at execExprInterp.c:1870
#10 0x0000555edd406e84 in ExecEvalExprSwitchContext (state=0x62500005b230,
econtext=0x62500005ac20,
isNull=0x7ffe49cfe900) at ../../../src/include/executor/executor.h:355
#11 0x0000555edd4070ab in ExecProject (projInfo=0x62500005b228) at
../../../src/include/executor/executor.h:389
#12 0x0000555edd40883e in ExecScan (node=0x62500005ab10,
accessMtd=0x555edd5843cf <SeqNext>,
recheckMtd=0x555edd5847cd <SeqRecheck>) at execScan.c:237
#13 0x0000555edd58484e in ExecSeqScan (pstate=0x62500005ab10) at
nodeSeqscan.c:112
#14 0x0000555edd3f5a71 in ExecProcNodeFirst (node=0x62500005ab10) at
execProcnode.c:464
#15 0x0000555edd5495a5 in ExecProcNode (node=0x62500005ab10) at
../../../src/include/executor/executor.h:273
#16 0x0000555edd5674b3 in ExecModifyTable (pstate=0x62500005a490) at
nodeModifyTable.c:3616
#17 0x0000555edd3f5a71 in ExecProcNodeFirst (node=0x62500005a490) at
execProcnode.c:464
#18 0x0000555edd3af500 in ExecProcNode (node=0x62500005a490) at
../../../src/include/executor/executor.h:273
#19 0x0000555edd3be25a in ExecutePlan (estate=0x62500005a218,
planstate=0x62500005a490,
use_parallel_mode=false, operation=CMD_UPDATE, sendTuples=false,
numberTuples=0,
direction=ForwardScanDirection, dest=0x625000072770, execute_once=true)
at execMain.c:1670
#20 0x0000555edd3b17f2 in standard_ExecutorRun (queryDesc=0x619000001598,
direction=ForwardScanDirection,
count=0, execute_once=true) at execMain.c:365
#21 0x0000555edd3b0e37 in ExecutorRun (queryDesc=0x619000001598,
direction=ForwardScanDirection, count=0,
execute_once=true) at execMain.c:309
#22 0x0000555ede0d688a in ProcessQuery (plan=0x625000006c28,
sourceText=0x625000005218 "UPDATE tbl_14 SET f.f2 = 'b';", params=0x0,
queryEnv=0x0, dest=0x625000072770,
qc=0x7ffe49cff1c0) at pquery.c:160
#23 0x0000555ede0deb02 in PortalRunMulti (portal=0x625000025a18,
isTopLevel=true, setHoldSnapshot=false,
dest=0x625000072770, altdest=0x625000072770, qc=0x7ffe49cff1c0) at
pquery.c:1277
#24 0x0000555ede0dba90 in PortalRun (portal=0x625000025a18,
count=9223372036854775807, isTopLevel=true,
run_once=true, dest=0x625000072770, altdest=0x625000072770,
qc=0x7ffe49cff1c0) at pquery.c:791
#25 0x0000555ede0c592d in exec_simple_query (query_string=0x625000005218
"UPDATE tbl_14 SET f.f2 = 'b';")
at postgres.c:1274
#26 0x0000555ede0d347a in PostgresMain (dbname=0x62900001b358 "test",
username=0x6250000020f8 "law")
at postgres.c:4632
#27 0x0000555eddcf401f in BackendRun (port=0x614000016040) at
postmaster.c:4461
#28 0x0000555eddcf2723 in BackendStartup (port=0x614000016040) at
postmaster.c:4189
#29 0x0000555eddce84dd in ServerLoop () at postmaster.c:1779
#30 0x0000555eddce69fe in PostmasterMain (argc=3, argv=0x6030000006a0) at
postmaster.c:1463
#31 0x0000555edd67850e in main (argc=3, argv=0x6030000006a0) at main.c:198

(gdb) frame 6
#6 0x0000555edc7b76d7 in heap_deform_tuple (tuple=0x7ffe49cfe570,
tupleDesc=0x7f64ca4562d8,
values=0x62500005ad40, isnull=0x62500005b130) at heaptuple.c:1312
1312 off = att_align_nominal(off,
thisatt->attalign);
(gdb) p thisatt->attalign
$1 = 0 '\000'

(gdb) frame 7
#7 0x0000555edd389d0b in ExecEvalFieldStoreDeForm (state=0x62500005b230,
op=0x62500005b338,
econtext=0x62500005ac20) at execExprInterp.c:3210
3210 heap_deform_tuple(&tmptup, tupDesc,
(gdb) p *tupDesc
$2 = {natts = 18406, tdtypeid = 1952409456, tdtypmod = 1953718639,
tdrefcount = 859320671, constr = 0x3338,
attrs = 0x7f64ca4562f0}

With some debug logging added I see (note the natts value):
2023-06-24 15:23:12.488 MSK [1741232] LOG: get_cached_rowtype() after
lookup_type_cache| type_id: 16389, typentry->tupDesc: 0x7ff12f045640,
typentry->tupDesc->natts: 2
2023-06-24 15:23:12.488 MSK [1741232] STATEMENT: UPDATE tbl_5 SET f.f2 =
'b';
2023-06-24 15:23:12.488 MSK [1741293] LOG: statement: CREATE TYPE rec_23 AS
(f1 text, f2 text);
2023-06-24 15:23:12.488 MSK [1741232] LOG: ExecEvalFieldStoreDeForm()
before heap_deform_tuple| tupDesc: 0x7ff12f045640, tupDesc->natts: 16427
2023-06-24 15:23:12.488 MSK [1741232] STATEMENT: UPDATE tbl_5 SET f.f2 =
'b';
TRAP: failed Assert("(thisatt->attalign) == TYPALIGN_SHORT"), File:
"heaptuple.c", Line: 1312, PID: 1741232

So it looks like *tupDesc memory was overridden due to relcache invalidation
during ExecEvalFieldStoreDeForm() execution after the get_cached_rowtype()
call.

The pg_usleep(1000) call added inside ExecEvalFieldStoreDeForm() after
get_cached_rowtype() greatly increases the probability of the crash.

Reproduced on REL_11_STABLE (since 96e38fa5e) .. master.

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Joe Conway 2023-06-24 13:09:44 Re: BUG #17946: LC_MONETARY & DO LANGUAGE plperl - BUG
Previous Message PG Bug reporting form 2023-06-24 06:01:25 BUG #17993: FK issue on logical replication table