Re: BUG #18259: Assertion in ExtendBufferedRelLocal() fails after no-space-left condition

From: tender wang <tndrwang(at)gmail(dot)com>
To: exclusion(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18259: Assertion in ExtendBufferedRelLocal() fails after no-space-left condition
Date: 2023-12-26 10:51:50
Message-ID: CAHewXNm4ktpPy=9g7NAYV=BoW-Ao3taZREPTcXnJDZ6P5esiag@mail.gmail.com
Lists: pgsql-bugs

Thanks for the report. I can reproduce the reported bug on master. I also
hit a different assertion failure when running the SQL below:

psql (17devel)
Type "help" for help.

postgres=# CREATE UNLOGGED TABLE filler(a int, b text STORAGE plain);
CREATE TABLE
postgres=# INSERT INTO filler SELECT g, repeat('x', 1000) FROM generate_series(1,
postgres(# 50000) g;
INSERT 0 50000
postgres=# CREATE TEMP TABLE tbl(a int);
CREATE TABLE
postgres=# INSERT INTO tbl SELECT g FROM generate_series(1, 200000) g;
ERROR: could not extend file "base/5/t3_16389": No space left on device
HINT: Check free disk space.
postgres=# DROP TABLE filler;
DROP TABLE
postgres=# INSERT INTO tbl SELECT g FROM generate_series(1, 200000) g;
INSERT 0 200000
postgres=# INSERT INTO tbl SELECT g FROM generate_series(1, 200000) g;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f9d3d8b1859 in __GI_abort () at abort.c:79
#2 0x000055f83501c868 in ExceptionalCondition
(conditionName=0x55f8351fcb78 "!(buf_state & (BM_VALID | BM_TAG_VALID |
BM_DIRTY | BM_JUST_DIRTIED))", fileName=0x55f8351fca4b "localbuf.c",
lineNumber=402) at assert.c:66
#3 0x000055f834df05ab in ExtendBufferedRelLocal (bmr=...,
fork=MAIN_FORKNUM, flags=8, extend_by=1, extend_upto=4294967295,
buffers=0x7ffff3ed1530, extended_by=0x7ffff3ed13fc)
at localbuf.c:402
#4 0x000055f834de7a0a in ExtendBufferedRelCommon (bmr=...,
fork=MAIN_FORKNUM, strategy=0x0, flags=8, extend_by=1,
extend_upto=4294967295, buffers=0x7ffff3ed1530, extended_by=0x7ffff3ed14dc)
at bufmgr.c:1828
#5 0x000055f834de6393 in ExtendBufferedRelBy (bmr=..., fork=MAIN_FORKNUM,
strategy=0x0, flags=8, extend_by=1, buffers=0x7ffff3ed1530,
extended_by=0x7ffff3ed14dc) at bufmgr.c:889
#6 0x000055f83492a240 in RelationAddBlocks (relation=0x7f9d325a7648,
bistate=0x0, num_pages=1, use_fsm=true, did_unlock=0x7ffff3ed168d) at
hio.c:342
#7 0x000055f83492ab67 in RelationGetBufferForTuple
(relation=0x7f9d325a7648, len=32, otherBuffer=0, options=0, bistate=0x0,
vmbuffer=0x7ffff3ed1714, vmbuffer_other=0x0, num_pages=1)
at hio.c:768
#8 0x000055f834910840 in heap_insert (relation=0x7f9d325a7648,
tup=0x55f83786e898, cid=0, options=0, bistate=0x0) at heapam.c:1853
#9 0x000055f834920cc0 in heapam_tuple_insert (relation=0x7f9d325a7648,
slot=0x55f83786e808, cid=0, options=0, bistate=0x0) at heapam_handler.c:252
#10 0x000055f834bd582a in table_tuple_insert (rel=0x7f9d325a7648,
slot=0x55f83786e808, cid=0, options=0, bistate=0x0) at
../../../src/include/access/tableam.h:1400
#11 0x000055f834bd7859 in ExecInsert (context=0x7ffff3ed1970,
resultRelInfo=0x55f836fe5ed0, slot=0x55f83786e808, canSetTag=true,
inserted_tuple=0x0, insert_destrel=0x0)
at nodeModifyTable.c:1133
#12 0x000055f834bdbbae in ExecModifyTable (pstate=0x55f836fe5cc0) at
nodeModifyTable.c:3806
#13 0x000055f834b9a6cb in ExecProcNodeFirst (node=0x55f836fe5cc0) at
execProcnode.c:464
#14 0x000055f834b8db69 in ExecProcNode (node=0x55f836fe5cc0) at
../../../src/include/executor/executor.h:273
#15 0x000055f834b9096f in ExecutePlan (estate=0x55f836fe5a30,
planstate=0x55f836fe5cc0, use_parallel_mode=false, operation=CMD_INSERT,
sendTuples=false, numberTuples=0,
direction=ForwardScanDirection, dest=0x55f836ff4378, execute_once=true)
at execMain.c:1670
#16 0x000055f834b8e20f in standard_ExecutorRun (queryDesc=0x55f836f35a20,
direction=ForwardScanDirection, count=0, execute_once=true) at
execMain.c:365
#17 0x000055f834b8e033 in ExecutorRun (queryDesc=0x55f836f35a20,
direction=ForwardScanDirection, count=0, execute_once=true) at
execMain.c:309
#18 0x000055f834e3f27a in ProcessQuery (plan=0x55f836ff4218,
sourceText=0x55f836f0b4b0 "INSERT INTO tbl SELECT g FROM generate_series(1,
200000) g;", params=0x0, queryEnv=0x0,
dest=0x55f836ff4378, qc=0x7ffff3ed1dd0) at pquery.c:160
#19 0x000055f834e40d99 in PortalRunMulti (portal=0x55f836f86a00,
isTopLevel=true, setHoldSnapshot=false, dest=0x55f836ff4378,
altdest=0x55f836ff4378, qc=0x7ffff3ed1dd0) at pquery.c:1277
#20 0x000055f834e402bf in PortalRun (portal=0x55f836f86a00,
count=9223372036854775807, isTopLevel=true, run_once=true,
dest=0x55f836ff4378, altdest=0x55f836ff4378, qc=0x7ffff3ed1dd0)
at pquery.c:791
#21 0x000055f834e39478 in exec_simple_query (query_string=0x55f836f0b4b0
"INSERT INTO tbl SELECT g FROM generate_series(1, 200000) g;") at
postgres.c:1273
#22 0x000055f834e3e105 in PostgresMain (dbname=0x55f836f42870 "postgres",
username=0x55f836f42858 "gpadmin") at postgres.c:4653
#23 0x000055f834d63393 in BackendRun (port=0x55f836f39fd0) at
postmaster.c:4422
#24 0x000055f834d62a4c in BackendStartup (port=0x55f836f39fd0) at
postmaster.c:4101
#25 0x000055f834d5f358 in ServerLoop () at postmaster.c:1769
#26 0x000055f834d5ec7e in PostmasterMain (argc=3, argv=0x55f836f05b80) at
postmaster.c:1468
#27 0x000055f834c1525d in main (argc=3, argv=0x55f836f05b80) at main.c:198
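
For reference, the condition that fails at localbuf.c:402 says that the buffer about to be used for the new block must have none of four flag bits set. A minimal sketch of that predicate follows. The bit positions are an assumption taken from src/include/storage/buf_internals.h as of PostgreSQL 16 and should be checked against the tree being debugged; victim_is_clean is a hypothetical helper name, not a PostgreSQL function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits as assumed from buf_internals.h (PostgreSQL 16); verify locally. */
#define BM_DIRTY        (1U << 23)
#define BM_VALID        (1U << 24)
#define BM_TAG_VALID    (1U << 25)
#define BM_JUST_DIRTIED (1U << 28)

/*
 * Hypothetical helper mirroring the asserted condition from the backtrace:
 *   !(buf_state & (BM_VALID | BM_TAG_VALID | BM_DIRTY | BM_JUST_DIRTIED))
 * Returns true when none of the four flags is set in buf_state.
 */
static bool
victim_is_clean(uint32_t buf_state)
{
    return !(buf_state & (BM_VALID | BM_TAG_VALID | BM_DIRTY | BM_JUST_DIRTIED));
}
```

Under these assumptions, victim_is_clean(0) holds, while any state that still carries one of the flags, for example victim_is_clean(BM_TAG_VALID), returns false and would trip the assertion.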

PG Bug reporting form <noreply(at)postgresql(dot)org> wrote on Tue, Dec 26, 2023 at 17:32:

> The following bug has been logged on the website:
>
> Bug reference: 18259
> Logged by: Alexander Lakhin
> Email address: exclusion(at)gmail(dot)com
> PostgreSQL version: 16.1
> Operating system: Ubuntu 22.04
> Description:
>
> The following script:
> mkdir /tmp/100m
> sudo mount -t tmpfs -o size=100M tmpfs /tmp/100m
> export PGDATA=/tmp/100m/tmpdb
>
> initdb
> pg_ctl -l server.log start
>
> cat << 'EOF' | psql
> CREATE UNLOGGED TABLE filler(a int, b text STORAGE plain);
> INSERT INTO filler SELECT g, repeat('x', 1000) FROM generate_series(1,
> 50000) g;
> CREATE TEMP TABLE tbl(a int);
> INSERT INTO tbl SELECT g FROM generate_series(1, 200000) g;
> INSERT INTO tbl SELECT g FROM generate_series(1, 200000) g;
> DROP TABLE filler;
> INSERT INTO tbl SELECT g from generate_series(1, 200000) g;
> EOF
>
> triggers an assertion failure following "no space left" errors:
> ...
> CREATE TABLE
> ERROR: could not extend file "base/5/t3_16391": No space left on device
> HINT: Check free disk space.
> ERROR: could not extend file "base/5/t3_16391": No space left on device
> HINT: Check free disk space.
> DROP TABLE
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> connection to server was lost
> TRAP: failed Assert("buf_state & BM_TAG_VALID"), File: "localbuf.c", Line:
> 390, PID: 25978
>
> The call stack of the failure is:
> ExtendBufferedRelLocal at localbuf.c:391:4
> ExtendBufferedRelCommon at bufmgr.c:1801:17
> ExtendBufferedRelBy at bufmgr.c:862:9
> RelationAddBlocks at hio.c:342:16
> RelationGetBufferForTuple at hio.c:768:11
> heap_insert at heapam.c:1862:11
> heapam_tuple_insert at heapam_handler.c:253:2
> table_tuple_insert at tableam.h:1402:1
> ExecInsert at nodeModifyTable.c:1138:21
> ExecModifyTable at nodeModifyTable.c:3810:12
> ExecProcNodeFirst at execProcnode.c:465:1
> ExecProcNode at executor.h:274:1
> ExecutePlan at execMain.c:1670:10
> standard_ExecutorRun at execMain.c:365:3
> ExecutorRun at execMain.c:310:1
> ProcessQuery at pquery.c:165:5
> PortalRunMulti at pquery.c:1277:5
> PortalRun at pquery.c:795:5
> exec_simple_query at postgres.c:1274:10
> PostgresMain at postgres.c:4641:27
> ExitPostmaster at postmaster.c:5047:1
> BackendStartup at postmaster.c:4196:5
> ServerLoop at postmaster.c:1788:6
> PostmasterMain at postmaster.c:1466:11
>
> The first bad commit for this anomaly is 31966b15 (and exactly that commit
> added the Assert).
>
> With debug logging added in this code within ExtendBufferedRelLocal():
> if (found)
> {
>     BufferDesc *existing_hdr =
>         GetLocalBufferDescriptor(hresult->id);
>     uint32 buf_state;
>
>     UnpinLocalBuffer(BufferDescriptorGetBuffer(victim_buf_hdr));
>
>     existing_hdr = GetLocalBufferDescriptor(hresult->id);
>     PinLocalBuffer(existing_hdr, false);
>     buffers[i] = BufferDescriptorGetBuffer(existing_hdr);
>
>     buf_state = pg_atomic_read_u32(&existing_hdr->state);
>     Assert(buf_state & BM_TAG_VALID);
>     Assert(!(buf_state & BM_DIRTY));
>     buf_state &= ~BM_VALID;
>     pg_atomic_unlocked_write_u32(&existing_hdr->state, buf_state);
> ...
> I see that this code is reached for the second INSERT (which also ends in a
> NOSPC error) with existing_hdr->state == 0x2040000, but for the third INSERT
> I observe state == 0x0.
>
>
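
To read the two state values quoted in the report, it helps to split the 32-bit word into its fields. The sketch below assumes the layout from src/include/storage/buf_internals.h as of PostgreSQL 16 (18 bits of refcount, 4 bits of usage count, 10 bits of flags); the struct and function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Layout assumed from buf_internals.h (PostgreSQL 16); verify locally. */
#define BUF_REFCOUNT_MASK    ((1U << 18) - 1)
#define BUF_USAGECOUNT_MASK  0x003C0000U
#define BUF_USAGECOUNT_SHIFT 18
#define BUF_FLAG_MASK        0xFFC00000U
#define BM_VALID             (1U << 24)
#define BM_TAG_VALID         (1U << 25)

/* Hypothetical container for the decoded fields (not a PostgreSQL type). */
typedef struct DecodedBufState
{
    uint32_t refcount;   /* low 18 bits */
    uint32_t usagecount; /* next 4 bits */
    uint32_t flags;      /* high 10 bits, left in place for BM_* tests */
} DecodedBufState;

/* Split a raw buffer state word into refcount, usage count, and flags. */
static DecodedBufState
decode_buf_state(uint32_t state)
{
    DecodedBufState d;

    d.refcount = state & BUF_REFCOUNT_MASK;
    d.usagecount = (state & BUF_USAGECOUNT_MASK) >> BUF_USAGECOUNT_SHIFT;
    d.flags = state & BUF_FLAG_MASK;
    return d;
}
```

With this layout, 0x2040000 decodes to refcount 0, usage count 1, and BM_TAG_VALID as the only flag, so Assert(buf_state & BM_TAG_VALID) holds for it. The value 0x0 has no flags set at all, which is the case that fails that assertion.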
