Re: Rework the way multixact truncations work

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Rework the way multixact truncations work
Date: 2015-09-23 18:48:50
Message-ID: 20150923184850.GK1573@awork2.anarazel.de
Lists: pgsql-hackers

On 2015-09-23 15:03:05 -0300, Alvaro Herrera wrote:
> The comment on top of TrimMultiXact states that "no locks are needed
> here", but then goes on to grab a few locks.

Hm. Yea. Although that was the case before.

> It's a bit odd that SetMultiXactIdLimit has the "finishedStartup" test
> so low. Why bother setting all those local variables only to bail
> out?

Hm. Doesn't seem to matter much to me, but I can change it.

> In MultiXactAdvanceOldest, the test for sawTruncationInCkptCycle seems
> reversed?
> if (!MultiXactState->sawTruncationInCkptCycle)
> surely we should be doing truncation if it's set?

No, that's correct. If there was a checkpoint cycle where oldestMulti
advanced without seeing a truncation record, we need to perform a legacy
truncation.
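
A minimal sketch of that decision, not the actual patch code: the
struct and the sketch_* functions are hypothetical stand-ins, and only
the names oldestMulti and sawTruncationInCkptCycle come from the
discussion. It assumes the flag is set when a truncation record is
replayed and cleared again at each replayed checkpoint.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant parts of the shared multixact state. */
typedef struct SketchMultiXactState
{
    uint32_t oldestMultiXactId;        /* oldest multi known to replay */
    bool     sawTruncationInCkptCycle; /* truncation record seen since last checkpoint? */
} SketchMultiXactState;

/* Wraparound-aware "a precedes b" comparison (modulo-2^32 arithmetic). */
static bool
sketch_precedes(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) < 0;
}

/* Assumed hook: called when a truncation WAL record is replayed. */
static void
sketch_replay_truncation(SketchMultiXactState *state)
{
    state->sawTruncationInCkptCycle = true;
}

/*
 * Assumed hook: called when a checkpoint record carrying oldestMulti is
 * replayed. Returns true when a legacy truncation is needed, i.e. when
 * oldestMulti advanced but no truncation record was seen in this cycle.
 */
static bool
sketch_advance_oldest(SketchMultiXactState *state, uint32_t oldestMulti)
{
    bool need_legacy = false;

    if (sketch_precedes(state->oldestMultiXactId, oldestMulti))
    {
        /* Note the negation: a missing truncation record is the trigger. */
        if (!state->sawTruncationInCkptCycle)
            need_legacy = true;
        state->oldestMultiXactId = oldestMulti;
    }

    /* A new checkpoint cycle starts here, so reset the flag. */
    state->sawTruncationInCkptCycle = false;
    return need_legacy;
}
```

In this sketch a primary that emits truncation records never triggers
the legacy path, while an older primary that only advances oldestMulti
in its checkpoint records does.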

> Honestly, I wonder whether this message
> ereport(LOG,
> (errmsg("performing legacy multixact truncation"),
> errdetail("Legacy truncations are sometimes performed when replaying WAL from an older primary."),
> errhint("Upgrade the primary, it is susceptible to data corruption.")));
> shouldn't rather be a PANIC. (The main reason not to, I think, is that
> once you see this, there is no way to put the standby in a working state
> without recloning).

Huh? The behaviour in that case is still better than what we have in
9.3+ today (not delayed till the restartpoint). Don't see why that
should be a panic. That'd imo make it pretty much impossible to upgrade
a pair of primary/standby where you normally upgrade the standby first?

This is all moot given Robert's objection to backpatching this to
9.3/4.

> If the find_multixact_start(oldestMulti) call in TruncateMultiXact
> fails, what recourse does the user have? I wonder if the elog() should
> be a FATAL instead of just LOG. It's not like it would work on a
> subsequent run, is it?

It currently only LOGs, and I don't want to change that. In the cases
where we currently know it's possible to hit this, it should be fixed by
the next set of emergency autovacuums (which we trigger).

Thanks for the look,

Andres
