Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts
Date: 2014-07-20 22:02:53
Message-ID: 20140720220253.GE5974@alap3.anarazel.de
Lists: pgsql-bugs

On 2014-07-20 17:43:04 -0400, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On 2014-07-20 17:22:48 -0400, Tom Lane wrote:
> >> 4. The patch Bruce applied to initialize datminmxid/relminmxid to the old
> >> NextMultiXactId rather than 1 does not fundamentally change anything here.
> >> It narrows the window in which wraparound can cause problems, but only by
> >> the distance that "1" is in-the-future at the time of upgrade.
>
> > I think it's actually more than that. Consider what happens if
> > pg_upgrade has used pg_resetxlog to set nextMulti to > 2^31. If
> > rel/datminmxid are set to 1 regardless, vac_update_relstats() and
> > vac_update_datfrozenxid() won't increase them anymore because of:
> >     /* relminmxid must never go backward, either */
> >     if (MultiXactIdIsValid(minmulti) &&
> >         MultiXactIdPrecedes(pgcform->relminmxid, minmulti))
> >     {
> >         pgcform->relminmxid = minmulti;
> >         dirty = true;
> >     }
>
> > And that can actually cause significant problems once 9.3+ creates new
> > multis because they'll never get vacuumed away but still do get
> > truncated. If it's an updating multi xmax that can effectively make the
> > row unreadable - not just block updates.
>
> No, I don't think so. Truncation is driven off oldestMultiXid from
> pg_control, not from relminmxid. The only thing in-the-future values of
> those will do to us is prevent autovacuum from thinking it must do a full
> table scan. (In particular, in-the-future values do not cause
> oldestMultiXid to get advanced, because we're always looking for the
> oldest value not the newest.)

Right. But that's the problem. If oldestMulti is set to, say, 3000000000
by pg_resetxlog during pg_upgrade but *minmxid = 1, those tables won't
get full-table scans because of multixacts. But vac_truncate_clog() will
call
    SetMultiXactIdLimit(minMulti, minmulti_datoid);
regardless.

Note that it'll not notice the limit of other databases in this case,
because vac_truncate_clog() will effectively use the in-memory
GetOldestMultiXactId() and check whether other databases are before that.
But there won't be any, because they all appear to be in the future. Due
to that, the next checkpoint will truncate the clog to the cutoff multi
xid used by the last vacuum.

Am I missing something?

> But in any case, we both agree that setting relminmxid to equal nextMulti
> is completely unsafe in a 9.3 cluster that's already been up. So the
> proposed fix instructions are certainly wrong.

Right. I'm pondering what to do about it instead. The best idea I have
is something like:
1) Jot down the output of pg_controldata|grep NextMultiXactId
2) Kill/wait for all existing transactions to end
3) Vacuum all databases with vacuum_multixact_freeze_min_age=0. That'll
   get rid of all old-appearing multis
4) Update pg_class to set relminmxid to the value from 1), same with
   datminmxid in pg_database

But that sucks and doesn't deal with all the problems :(

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
