BUG #17072: Assert for clogGroupNext failed due to a race condition in TransactionGroupUpdateXidStatus()

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: exclusion(at)gmail(dot)com
Subject: BUG #17072: Assert for clogGroupNext failed due to a race condition in TransactionGroupUpdateXidStatus()
Date: 2021-06-24 16:00:01
Message-ID: 17072-2f8764857ef2c92a@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 17072
Logged by: Alexander Lakhin
Email address: exclusion(at)gmail(dot)com
PostgreSQL version: 13.3
Operating system: Ubuntu 20.04
Description:

When running multiple installchecks (100-200) in a loop I get the following
fail:

Core was generated by `postgres: postgres regress119 127.0.0.1(60410) in'.
Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig(at)entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 __GI_raise (sig=sig(at)entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f1c7203d535 in __GI_abort () at abort.c:79
#2 0x00005645576b41a2 in ExceptionalCondition (
conditionName=conditionName(at)entry=0x564557840fc8
"pg_atomic_read_u32(&MyProc->clogGroupNext) == INVALID_PGPROCNO",
errorType=errorType(at)entry=0x56455771001d "FailedAssertion",
fileName=fileName(at)entry=0x56455774676d "proc.c",
lineNumber=lineNumber(at)entry=498) at assert.c:67
#3 0x000056455757b062 in InitProcess () at proc.c:498
#4 0x00005645575893df in PostgresMain (argc=1,
argv=argv(at)entry=0x5645581d5820, dbname=<optimized out>,
username=0x5645581d5168 "postgres") at postgres.c:4393
#5 0x00005645574e6f91 in BackendRun (port=0x5645581c73d0,
port=0x5645581c73d0) at postmaster.c:5784
#6 BackendStartup (pool=pool(at)entry=0x0, port=port(at)entry=0x5645581c73d0) at
postmaster.c:5440
#7 0x00005645574e769b in ServerLoop () at postmaster.c:2403
#8 0x00005645574e8e4b in PostmasterMain (argc=<optimized out>,
argv=<optimized out>) at postmaster.c:1870
#9 0x00005645571db177 in main (argc=3, argv=0x56455819ad30) at main.c:211

(gdb) print MyProc->clogGroupNext
$1 = {value = 985}
(gdb) print &MyProc->clogGroupNext
$2 = (pg_atomic_uint32 *) 0x7f1c64806a0c

With some diagnostic output added I've got in the server log:
2021-06-24 16:44:16.018 MSK|postgres|regress184|60d48c2d.ea21|WARNING:
TransactionGroupUpdateXidStatus: 819, 2147483647
...
2021-06-24 16:44:16.018 MSK|postgres|regress54|60d48c27.df96|WARNING:
TransactionGroupUpdateXidStatus: 982, 2147483647
2021-06-24 16:44:16.021 MSK|postgres|regress33|60d48c2f.efdc|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c648dd8ec, 974
2021-06-24 16:44:16.018 MSK|postgres|regress44|60d48c2a.e5a7|WARNING:
TransactionGroupUpdateXidStatus: 981, 2147483647
2021-06-24 16:44:16.021 MSK|postgres|regress72|60d48c2d.ea64|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c64880824, 974
2021-06-24 16:44:16.021 MSK|postgres|regress32|60d48c2f.ee50|WARNING:
TransactionGroupUpdateXidStatus: after while: 974
2021-06-24 16:44:16.018 MSK|postgres|regress144|60d48c2e.ebc5|WARNING:
TransactionGroupUpdateXidStatus: 963, 2147483647
2021-06-24 16:44:16.021 MSK|postgres|regress184|60d48c2d.ea21|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c64806a0c, 2147483647
2021-06-24 16:44:16.018 MSK|postgres|regress149|60d48c2a.e674|WARNING:
TransactionGroupUpdateXidStatus: 799, 2147483647
2021-06-24 16:44:16.021 MSK|postgres|regress194|60d48c2f.eed2|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c648610e4, 944
2021-06-24 16:44:16.018 MSK|postgres|regress114|60d48c29.e3e1|WARNING:
TransactionGroupUpdateXidStatus: 816, 2147483647
2021-06-24 16:44:16.021 MSK|postgres|regress33|60d48c2f.efdc|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c648dd8ec, 944
2021-06-24 16:44:16.021 MSK|postgres|regress186|60d48c2f.eea9|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c648e95a4, 2147483647
2021-06-24 16:44:16.018 MSK|postgres|regress147|60d48c2f.efb7|ERROR: cannot
change routine kind
2021-06-24 16:44:16.018 MSK|postgres|regress147|60d48c2f.efb7|DETAIL:
"functest1" is a function.
2021-06-24 16:44:16.021 MSK|postgres|regress72|60d48c2d.ea64|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c64880824, 944
2021-06-24 16:44:16.021 MSK|postgres|regress50|60d48c2b.e710|ERROR:
relation "does_not_exist" does not exist
2021-06-24 16:44:16.020 MSK|postgres|regress22|60d48c2f.efed|WARNING:
TransactionGroupUpdateXidStatus: after while: 2147483647
2021-06-24 16:44:16.021 MSK|postgres|regress54|60d48c27.df96|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c648dc3f4, 2147483647
2021-06-24 16:44:16.020 MSK|postgres|regress36|60d48c2f.eff6|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 INVALID_PGPROCNO:
0x7f1c64828b3c, 2147483647
2021-06-24 16:44:16.021 MSK|postgres|regress34|60d48c2f.ee8b|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c648e02dc, 2147483647
2021-06-24 16:44:16.020 MSK|postgres|regress11|60d48c2f.ef8e|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c647f6e6c, 792
2021-06-24 16:44:16.021 MSK|postgres|regress44|60d48c2a.e5a7|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c648daefc, 2147483647
2021-06-24 16:44:16.021 MSK|postgres|regress184|60d48c2d.ea21|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c64806a0c, 944
2021-06-24 16:44:16.021 MSK|postgres|regress194|60d48c2f.eed2|WARNING:
TransactionGroupUpdateXidStatus: after while: 944
2021-06-24 16:44:16.020 MSK|postgres|regress155|60d48c2f.efe1|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c648ad114, 792
2021-06-24 16:44:16.021 MSK|postgres|regress33|60d48c2f.efdc|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c648dd8ec, 888
2021-06-24 16:44:16.021 MSK|postgres|regress186|60d48c2f.eea9|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c648e95a4, 888
2021-06-24 16:44:16.021 MSK|postgres|regress72|60d48c2d.ea64|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c64880824, 888
2021-06-24 16:44:16.021 MSK|postgres|regress54|60d48c27.df96|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c648dc3f4, 888
2021-06-24 16:44:16.021 MSK|postgres|regress144|60d48c2e.ebc5|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c648c358c, 2147483647
2021-06-24 16:44:16.021 MSK|postgres|regress186|60d48c2f.eea9|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c648e95a4, 985
2021-06-24 16:44:16.021 MSK|postgres|regress34|60d48c2f.ee8b|WARNING:
TransactionGroupUpdateXidStatus: after while: 2147483647
2021-06-24 16:44:16.021 MSK|postgres|regress149|60d48c2a.e674|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c647ec6ac, 2147483647
2021-06-24 16:44:16.021 MSK|postgres|regress44|60d48c2a.e5a7|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c648daefc, 985
2021-06-24 16:44:16.021 MSK|postgres|regress22|60d48c2f.efed|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 INVALID_PGPROCNO:
0x7f1c648610e4, 2147483647
2021-06-24 16:44:16.021 MSK|postgres|regress36|60d48c2f.eff6|WARNING:
TransactionGroupUpdateXidStatus: end
2021-06-24 16:44:16.021 MSK|postgres|regress184|60d48c2d.ea21|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c64806a0c, 985
2021-06-24 16:44:16.021 MSK|postgres|regress114|60d48c29.e3e1|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c64802b24, 2147483647
2021-06-24 16:44:16.021 MSK|postgres|regress11|60d48c2f.ef8e|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c647f6e6c, 985
2021-06-24 16:44:16.021 MSK|postgres|regress33|60d48c2f.efdc|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c648dd8ec, 985
2021-06-24 16:44:16.021 MSK|postgres|regress50|60d48c2b.e710|ERROR: syntax
error at or near "arg" at character 22
2021-06-24 16:44:16.021 MSK|postgres|regress155|60d48c2f.efe1|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c648ad114, 985
2021-06-24 16:44:16.021 MSK|postgres|regress72|60d48c2d.ea64|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c64880824, 985
2021-06-24 16:44:16.021 MSK|postgres|regress54|60d48c27.df96|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 nextidx:
0x7f1c648dc3f4, 985
2021-06-24 16:44:16.021 MSK|postgres|regress186|60d48c2f.eea9|WARNING:
TransactionGroupUpdateXidStatus: after while: 985
2021-06-24 16:44:16.021 MSK|postgres|regress34|60d48c2f.ee8b|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 INVALID_PGPROCNO:
0x7f1c648e95a4, 2147483647
2021-06-24 16:44:16.021 MSK|postgres|regress144|60d48c2e.ebc5|WARNING:
TransactionGroupUpdateXidStatus: after while: 2147483647
2021-06-24 16:44:16.021 MSK|postgres|regress44|60d48c2a.e5a7|WARNING:
TransactionGroupUpdateXidStatus: return false (963)
2021-06-24 16:44:16.021 MSK|postgres|regress22|60d48c2f.efed|WARNING:
TransactionGroupUpdateXidStatus: pg_atomic_write_u32 INVALID_PGPROCNO:
0x7f1c648aa724, 2147483647
2021-06-24 16:44:16.021 MSK|postgres|regress184|60d48c2d.ea21|WARNING:
TransactionGroupUpdateXidStatus: return false (963)
2021-06-24 16:44:16.021 MSK|postgres|regress194|60d48c2f.eed2|WARNING:
TransactionGroupUpdateXidStatus: return true (944)
2021-06-24 16:44:16.021 MSK|postgres|regress11|60d48c2f.ef8e|WARNING:
TransactionGroupUpdateXidStatus: return false (963)
2021-06-24 16:44:16.021 MSK|postgres|regress33|60d48c2f.efdc|WARNING:
TransactionGroupUpdateXidStatus: return false (963)

The offending (the one that leaved a "valid" clogGroupNext) proccess is
60d48c2d.ea21. It looks like it got from the
pg_atomic_compare_exchange_u32() the nextidx value that was written in the
clogGroupFirst by the process 60d48c2e.ebc5, and exited just after that.

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Alexander Lakhin 2021-06-24 16:15:00 Re: BUG #17072: Assert for clogGroupNext failed due to a race condition in TransactionGroupUpdateXidStatus()
Previous Message Pawel Kudzia 2021-06-24 13:32:32 Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows