Re: ERROR: cannot freeze committed xmax

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Sasha Aliashkevich <olsender(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: ERROR: cannot freeze committed xmax
Date: 2021-07-15 13:41:18
Message-ID: 202107151341.z63mfqvq3vrs@alvherre.pgsql
Lists: pgsql-general

One thing I forgot is that these XIDs are fairly old, perhaps dating
back to when this database was freshly initdb'd if there has been no XID
wraparound. In that case you were probably running a version much older
than 10.14 when they were written. Do you happen to know when you
initdb'd this, with what version, and when you upgraded it to 10.14?
That may help search the commit log for bugfixes that might explain the
bug. I just remembered this one as my favorite candidate:

Author: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Branch: master Release: REL_11_BR [d2599ecfc] 2018-05-04 18:24:45 -0300
Branch: REL_10_STABLE Release: REL_10_4 [e1d634758] 2018-05-04 18:23:58 -0300
Branch: REL9_6_STABLE Release: REL9_6_9 [3a11485a5] 2018-05-04 18:23:30 -0300

Don't mark pages all-visible spuriously

Dan Wood diagnosed a long-standing problem that pages containing tuples
that are locked by multixacts containing live lockers may spuriously end
up as candidates for getting their all-visible flag set. This has the
long-term effect that multixacts remain unfrozen; this may previously
pass undetected, but since commit XYZ it would be reported as
"ERROR: found multixact 134100944 from before relminmxid 192042633"
because when a later vacuum tries to freeze the page it detects that a
multixact that should have gotten frozen, wasn't.

Dan proposed a (correct) patch that simply sets a variable to its
correct value, after a bogus initialization. But, per discussion, it
seems better coding to avoid the bogus initializations altogether, since
they could give rise to more bugs later. Therefore this fix rewrites
the logic a little bit to avoid depending on the bogus initializations.

This bug was part of a family introduced in 9.6 by commit a892234f830e;
later, commit 38e9f90a227d fixed most of them, but this one was
unnoticed.

Authors: Dan Wood, Pavan Deolasee, Álvaro Herrera
Reviewed-by: Masahiko Sawada, Pavan Deolasee, Álvaro Herrera
Discussion: https://postgr.es/m/84EBAC55-F06D-4FBE-A3F3-8BDA093CE3E3@amazon.com
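
As a rough illustration of the check behind the error message quoted in
the commit message above: multixact IDs are 32-bit counters that wrap
around, so "older than" is decided modulo 2^32 rather than by a plain
integer comparison. The sketch below is a simplified Python model of
that comparison, not the server's C code; the function names
multixact_precedes and check_multixact are made up for this example.

```python
def multixact_precedes(multi1: int, multi2: int) -> bool:
    """Return True if multi1 is logically older than multi2.

    Hypothetical helper: models a wraparound-aware comparison of two
    32-bit IDs by looking at the sign of their difference modulo 2^32.
    """
    diff = (multi1 - multi2) & 0xFFFFFFFF
    # A difference with the top bit set is negative as a signed 32-bit int.
    return diff >= 0x80000000


def check_multixact(multi: int, relminmxid: int) -> None:
    """Raise if a tuple still carries a multixact older than relminmxid.

    Hypothetical helper: such a multixact should already have been
    frozen, so finding one indicates an earlier vacuum skipped the page.
    """
    if multixact_precedes(multi, relminmxid):
        raise ValueError(
            f"found multixact {multi} from before relminmxid {relminmxid}"
        )


if __name__ == "__main__":
    # Values taken from the error message quoted in the commit message.
    try:
        check_multixact(134100944, 192042633)
    except ValueError as err:
        print(f"ERROR: {err}")
```

With the values from the quoted message, 134100944 is older than
192042633, so the check fails. That is the situation the commit
describes: a page wrongly marked all-visible is skipped by vacuum and
its multixact is left unfrozen behind the table's relminmxid.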

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"El número de instalaciones de UNIX se ha elevado a 10,
y se espera que este número aumente" (UPM, 1972)
