From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Tomas Vondra <tomas(at)vondra(dot)me>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com> |
Subject: | Re: Snapshot related assert failure on skink |
Date: | 2025-03-23 16:43:05 |
Message-ID: | 697c7314-ed97-491f-815e-0f58064b68a4@iki.fi |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 21/03/2025 17:16, Andres Freund wrote:
> Am I right in understanding that the only scenario (when in
> STANDBY_SNAPSHOT_READY), where ExpireOldKnownAssignedTransactionIds() would
> "legally" remove a transaction, rather than the commit / abort records doing
> so, is if the primary crash-restarted while transactions were ongoing?
>
> Those transactions won't have a commit/abort records and thus won't trigger
> ExpireTreeKnownAssignedTransactionIds(), which otherwise would have updated
> ->xactCompletionCount?
Correct.
> When writing the snapshot caching patch, I tried to make sure that all the
> places that maintain ->latestCompletedXid also update
> ->xactCompletionCount. Afaict that's still the case. Which, I think, means
> that we're also missing calls to MaintainLatestCompletedXidRecovery()?
Yep, I was just looking at that too.
> If latestCompletedXid is incorrect visibility determinations end up wrong...
I think it happens to work, because the transactions are effectively
aborted. latestCompletedXid is used to initialize xmax in
GetSnapshotData. If, for example, latestCompletedXid is incorrectly set
to 1000 even though XID 1001 already aborted, a snapshot with xmax=1000
still correctly considers XID 1001 as "not visible". As soon as a
transaction commits, latestCompletedXid is fixed.
AFAICS we could skip updating latestCompletedXid on aborts altogether
and rename it to latestCommittedXid. But it's hardly worth optimizing
for aborts.
For the same reason, I believe the assertion failure we're discussing
here is also harmless. Even though the reused snapshot has a smaller
xmin than expected, all those transactions aborted and are thus not
visible anyway.
In any case, let's be tidy and fix both latestCompletedXid and
xactCompletionCount.
--
Heikki Linnakangas
Neon (https://neon.tech)
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2025-03-23 16:59:43 | Re: AIO v2.5 |
Previous Message | Noah Misch | 2025-03-23 16:32:48 | Re: AIO v2.5 |