Re: MultiXactId error after upgrade to 9.3.4

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MultiXactId error after upgrade to 9.3.4
Date: 2014-03-30 18:53:49
Message-ID: 20140330185349.GD4582@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

* Stephen Frost (sfrost(at)snowman(dot)net) wrote:
> I have the pre-upgrade database and can upgrade/rollback/etc that pretty
> easily. Note that the table contents weren't changed during the
> upgrade, of course, and so the 9.2.6 instance has HEAP_XMAX_IS_MULTI set
> while t_xmax is 6849409 for the tuple in question- even though
> pg_controldata reports NextMultiXactId as 1601462 (and it seems very
> unlikely that there's been a wraparound on that in this database..).

Further review leads me to notice that both HEAP_XMAX_IS_MULTI and
HEAP_XMAX_INVALID are set:

t_infomask | 6528

6528 decimal -> 0x1980

0001 1001 1000 0000

Which gives us:

0000 0000 1000 0000 - HEAP_XMAX_LOCK_ONLY
0000 0001 0000 0000 - HEAP_XMIN_COMMITTED
0000 1000 0000 0000 - HEAP_XMAX_INVALID
0001 0000 0000 0000 - HEAP_XMAX_IS_MULTI

Which shows that both HEAP_XMAX_INVALID and HEAP_XMAX_IS_MULTI are set.
Of some interest is that HEAP_XMAX_LOCK_ONLY is also set..

> Perhaps something screwed up xmax/HEAP_XMAX_IS_MULTI under 9.2 and the
> 9.3 instance now detects that something is wrong? Or is this a case
> which was previously allowed and it's just in 9.3 that we don't like it?

The 'improve concurrency of FK locking' patch included a change to
UpdateXmaxHintBits():

- * [...] Hence callers should look
- * only at XMAX_INVALID.

...

+ * Hence callers should look only at XMAX_INVALID.
+ *
+ * Note this is not allowed for tuples whose xmax is a multixact.

[...]

+ Assert(!(tuple->t_infomask & HEAP_XMAX_IS_MULTI));

What isn't clear to me is if this restriction was supposed to always be
there and something pre-9.3 screwed this up, or if this is a *new*
restriction on what's allowed, in which case it's an on-disk format
change that needs to be accounted for.

One other thing to mention is that this system was originally a 9.0
system and the last update to this tuple that we believe happened was
when it was on 9.0, prior to the 9.2 upgrade (which happened about a
year ago), so it's possible the issue is from the 9.0 era.

Thanks,

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Christoph Berg 2014-03-30 19:52:26 Re: Securing "make check" (CVE-2014-0067)
Previous Message Stephen Frost 2014-03-30 14:11:35 Re: MultiXactId error after upgrade to 9.3.4