BUG #18158: Assert in pgstat_report_stat() fails when a backend shutting down with stats pending

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: exclusion(at)gmail(dot)com
Subject: BUG #18158: Assert in pgstat_report_stat() fails when a backend shutting down with stats pending
Date: 2023-10-16 09:00:00
Message-ID: 18158-88f667028dbc7e7b@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 18158
Logged by: Alexander Lakhin
Email address: exclusion(at)gmail(dot)com
PostgreSQL version: 16.0
Operating system: Ubuntu 22.04
Description:

When running check-world in a loop on a rather slow device, I observed
an interesting case. All tests passed successfully, but in the system
journal I found a message about postgres crash during the run.
I reproduced the issue and found in src/test/recovery/tmp_check/log/
012_subtransactions_primary.log:
TRAP: failed Assert("!pgStatLocal.shmem->is_shutdown"), File: "pgstat.c",
Line: 616, PID: 23770

My analysis shows that this Assert failed in a backend (walsender, in my
case) when it has pending stats entries on a server shutdown.
In absence of such entries pgstat_report_stats() exits before the Assert:
/* Don't expend a clock check if nothing to do */
if (dlist_is_empty(&pgStatPending) &&
!have_iostats &&
!have_slrustats &&
!pgstat_have_pending_wal())
{
Assert(pending_since == 0);
return 0;
}
/*
* There should never be stats to report once stats are shut down. Can't
* assert that before the checks above, as there is an unconditional
* pgstat_report_stat() call in pgstat_shutdown_hook() - which at least
* the process that ran pgstat_before_server_shutdown() will still call.
*/
Assert(!pgStatLocal.shmem->is_shutdown);

With the following modification in pgstat_flush_pending_entries():
+if (nowait && (rand() % 10 == 0))
+ did_flush = false;
+else
+{
did_flush = kind_info->flush_pending_cb(entry_ref, nowait);
+}

the issue reproduced easily:
make -s check -C src/test/recovery/ PROVE_TESTS="t/012_subtransactions.pl"
grep TRAP -r src/test/recovery/tmp_check/log

# +++ tap check in src/test/recovery +++
t/012_subtransactions.pl .. ok
All tests successful.
Files=1, Tests=12, 3 wallclock secs ( 0.01 usr 0.00 sys + 0.19 cusr 0.27
csys = 0.47 CPU)
Result: PASS
src/test/recovery/tmp_check/log/012_subtransactions_primary.log:
TRAP: failed Assert("!pgStatLocal.shmem->is_shutdown"), File: "pgstat.c",
Line: 616, PID: 2410126

The stack trace from the log:
postgres: primary: walsender law [local] streaming
0/3019C78(ExceptionalCondition+0xbb)[0x5616c7b58b89]
postgres: primary: walsender law [local] streaming
0/3019C78(pgstat_report_stat+0xef)[0x5616c7991ff1]
postgres: primary: walsender law [local] streaming
0/3019C78(+0x6a9e57)[0x5616c7991e57]
postgres: primary: walsender law [local] streaming
0/3019C78(shmem_exit+0xa6)[0x5616c792d852]
postgres: primary: walsender law [local] streaming
0/3019C78(+0x6456f6)[0x5616c792d6f6]
postgres: primary: walsender law [local] streaming
0/3019C78(proc_exit+0x19)[0x5616c792d63a]
postgres: primary: walsender law [local] streaming
0/3019C78(+0x607cec)[0x5616c78efcec]
postgres: primary: walsender law [local] streaming
0/3019C78(+0x606e47)[0x5616c78eee47]
postgres: primary: walsender law [local] streaming
0/3019C78(+0x6043c8)[0x5616c78ec3c8]
postgres: primary: walsender law [local] streaming
0/3019C78(exec_replication_command+0x4cc)[0x5616c78edfa5]
postgres: primary: walsender law [local] streaming
0/3019C78(PostgresMain+0x7df)[0x5616c796d54b]
...

Browse pgsql-bugs by date

  From Date Subject
Next Message Richard Guo 2023-10-16 11:02:29 Assert failure when CREATE TEMP TABLE
Previous Message Richard Guo 2023-10-16 06:46:43 Re: BUG #18152: Join condition is not pushed down to union all subquery