From: | Alvaro Herrera <alvherre(at)commandprompt(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: freezing multixacts |
Date: | 2012-02-06 14:31:20 |
Message-ID: | 1328537689-sup-1841@alvh.no-ip.org |
Lists: | pgsql-hackers |
Excerpts from Robert Haas's message of Thu Feb 02 11:24:08 -0300 2012:
> On Wed, Feb 1, 2012 at 11:33 PM, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> > If there's only one remaining member, the problem is easy: replace it
> > with that transaction's xid, and set the appropriate hint bits. But if
> > there's more than one, the only way out is to create a new multi. This
> > increases multixactid consumption, but I don't see any other option.
>
> Why do we need to freeze anything if the transactions are still
> running? We certainly don't freeze regular transaction IDs while the
> transactions are still running; it would give wrong answers. It's
> probably possible to do it for mxids, but why would you need to?
Well, I was thinking that we could keep generating new mxids
indefinitely, and if we didn't freeze the old ones that are still
running, we could overflow. So one way to deal with the problem would be
rewriting the old ones into new ones. But it has occurred to me that
instead of doing that we could simply disallow creation of new ones until
the oldest ones have been closed and removed from tables -- which is more
in line with what we do for Xids anyway.
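
To make that last idea concrete, here is a toy sketch in Python. It is not
PostgreSQL code: the class, its method names and the max_span parameter are
all made up for illustration, and IDs are modeled as plain increasing
integers, with max_span standing in for the wraparound horizon.

```python
class MultiXactAllocator:
    """Toy model: hand out multixact IDs, but refuse once the oldest
    one still present in a table is too far behind the next ID."""

    def __init__(self, max_span):
        self.max_span = max_span  # hypothetical stand-in for the wraparound limit
        self.next_id = 1
        self.in_use = set()       # mxids still referenced by some tuple

    def allocate(self):
        # Block creation while the oldest live mxid is too old,
        # instead of rewriting old multis into new ones.
        if self.in_use and self.next_id - min(self.in_use) >= self.max_span:
            raise RuntimeError(
                "oldest multixact must be removed before creating new ones"
            )
        mxid = self.next_id
        self.next_id += 1
        self.in_use.add(mxid)
        return mxid

    def remove(self, mxid):
        # Called once the mxid no longer appears in any table.
        self.in_use.discard(mxid)
```

With max_span=3, a fourth allocate() fails while mxid 1 is still in use, and
succeeds again after remove(1).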
> Suppose you have a tuple A which is locked by a series of transactions
> T0, T1, T2, ...; AIUI, each new locker is going to have to create a
> new mxid with all the existing entries plus a new one for itself.
> But, unless I'm confused, as it's doing so, it can discard any entries
> for locks taken by transactions which are no longer running.
That's correct. But the problem is a tuple that is locked or updated by
a very old transaction that doesn't commit or roll back, and the tuple is
never locked again. Eventually the Xid could remain live while the mxid
is in wraparound danger.
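
As a rough illustration of the scheme Robert describes, the member list for a
new multi could be built like this. This is only a sketch in Python; the
function name and its arguments are hypothetical and do not correspond to any
PostgreSQL function.

```python
def extend_multixact(old_members, running_xids, new_locker):
    """Return the member list for a new multixact: the still-running
    members of the old one, plus the new locker."""
    # Entries for transactions that are no longer running are discarded.
    members = [xid for xid in old_members if xid in running_xids]
    if new_locker not in members:
        members.append(new_locker)
    return members


# Xid 101 has finished, so it is dropped when 103 takes its lock.
members = extend_multixact([100, 101, 102], {100, 102, 103}, 103)
# members is [100, 102, 103]
```

Note that this pruning only happens when somebody locks the tuple again. If
nobody does, the old multi stays in the tuple for as long as its old member
keeps running, which is the case I'm worried about.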
> So given
> an mxid with living members, any dead member in that mxid must have
> been living at the time the newest member was added. Surely we can't
> be consuming mxids anywhere near fast enough for that to be a problem.
Well, the problem is that while it should be rare to consume mxids as
fast as necessary for this problem to show up, it *is* possible --
unless we add some protection that they are not created until the old
ones are frozen (which now means "removed").
> > However, there are cases where not even that is possible -- consider
> > tuple freezing during WAL recovery. Recovery is going to need to
> > replace those multis with other multis, but it cannot create new multis
> > itself. The only solution here appears to be that when multis are
> > frozen in the master, replacement multis have to be logged too. So the
> > heap_freeze_tuple Xlog record will have a map of old multi to new. That
> > way, recovery can just determine the new multi to use for any particular
> > old multi; since multixact creation is also logged, we're certain that
> > the replacement value has already been defined.
>
> This doesn't sound right. Why would recovery need to create a multi
> that didn't exist on the master? Any multi it applies to a record
> should be one that it was told to apply by the master; and the master
> should have already WAL-logged the creation of that multi. I don't
> think that "replacement" mxids have to be logged; I think that *all*
> mxids have to be logged. Am I all wet?
Well, yeah, all mxids are logged, in particular those that would have
been used for replacement. However, I think I've discarded the idea of
replacement altogether now, because that makes things simpler on both
master and slave.
--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support