From: | Tomas Vondra <tomas(at)vondra(dot)me> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Assert(TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin)); |
Date: | 2025-03-15 17:49:32 |
Message-ID: | 3aee3535-184d-423e-a5eb-05161e8b5617@vondra.me |
Lists: | pgsql-hackers |
Hi,
While stress-testing physical replication to test another patch, I've
repeatedly hit this assert in GetSnapshotDataReuse:
Assert(TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin));
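For readers less familiar with this check: the assertion requires that TransactionXmin is not newer than RecentXmin, where transaction IDs are 32-bit counters compared modulo 2^32. Below is a minimal standalone sketch of that kind of wraparound-aware comparison. The names xid_precedes_or_equals and FIRST_NORMAL_XID are illustrative stand-ins chosen for this example, not the PostgreSQL source, and the sketch says nothing about why the condition fails on the standby.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for a 32-bit transaction ID. */
typedef uint32_t TransactionId;

/* XIDs below this value are special (invalid, bootstrap, frozen). */
#define FIRST_NORMAL_XID ((TransactionId) 3)

/*
 * Sketch of a wraparound-aware "id1 <= id2" test for 32-bit XIDs.
 * Hypothetical helper for illustration only.
 */
static bool
xid_precedes_or_equals(TransactionId id1, TransactionId id2)
{
    /* Special XIDs are compared as plain unsigned integers. */
    if (id1 < FIRST_NORMAL_XID || id2 < FIRST_NORMAL_XID)
        return id1 <= id2;

    /*
     * Normal XIDs: take the difference modulo 2^32 and interpret it as
     * signed, so an XID just past the wraparound point still counts as
     * newer than one just before it.
     */
    int32_t diff = (int32_t) (id1 - id2);

    return diff <= 0;
}
```

With these definitions, xid_precedes_or_equals(1000, 1005) holds, while xid_precedes_or_equals(1005, 1000) does not. The second case corresponds to the state the assertion rejects: TransactionXmin being ahead of RecentXmin.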
AFAICS the backtrace always looks exactly the same. Here are the frames
nearest the top:
#4 0x0000000000b5af36 in ExceptionalCondition (conditionName=0xe2a470 "TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin)", fileName=0xe29768 "procarray.c", lineNumber=2132) at assert.c:66
#5 0x000000000094c2c3 in GetSnapshotDataReuse (snapshot=0x1079380 <CatalogSnapshotData>) at procarray.c:2132
#6 0x000000000094c4b0 in GetSnapshotData (snapshot=0x1079380 <CatalogSnapshotData>) at procarray.c:2233
#7 0x0000000000bb31f8 in GetNonHistoricCatalogSnapshot (relid=2615) at snapmgr.c:412
#8 0x0000000000bb31a2 in GetCatalogSnapshot (relid=2615) at snapmgr.c:385
#9 0x0000000000493073 in systable_beginscan (heapRelation=0x76cee3dc01d8, indexId=2684, indexOK=true, snapshot=0x0, nkeys=1, key=0x7ffcaddd2f30) at genam.c:414
#10 0x0000000000b383c0 in SearchCatCacheMiss (cache=0xa99a880, nkeys=1, hashValue=2798280003, hashIndex=3, v1=130630962093252, v2=0, v3=0, v4=0) at catcache.c:1533
#11 0x0000000000b38290 in SearchCatCacheInternal (cache=0xa99a880, nkeys=1, v1=130630962093252, v2=0, v3=0, v4=0) at catcache.c:1464
#12 0x0000000000b37f69 in SearchCatCache (cache=0xa99a880, v1=130630962093252, v2=0, v3=0, v4=0) at catcache.c:1318
#13 0x0000000000b53df5 in SearchSysCache (cacheId=37, key1=130630962093252, key2=0, key3=0, key4=0) at syscache.c:217
#14 0x0000000000b54318 in GetSysCacheOid (cacheId=37, oidcol=1, key1=130630962093252, key2=0, key3=0, key4=0) at syscache.c:459
At first I assumed it's something my patch broke, but then I tried the
stress test on master, and I hit that assert again, so it's a
pre-existing issue.
The stress test is fairly simple - it creates a primary/standby cluster,
and then does pgbench on both nodes (read-write on primary, read-only on
standby), and also randomly restarts both nodes (in fast or immediate
modes). The script I use is attached.
It takes a while to hit it. On my laptop it takes an hour or so, but I
suspect that is mostly due to the random sleeps in the script.
I've only ever seen this on the standby, never on the primary.
regards
--
Tomas Vondra
Attachment | Content-Type | Size |
---|---|---|
backtrace.txt | text/plain | 5.3 KB |
test2.sh | application/x-shellscript | 6.7 KB |