From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Steve Kehlet <steve(dot)kehlet(at)gmail(dot)com>, Forums postgresql <pgsql-general(at)postgresql(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1 |
Date: | 2015-06-08 13:15:04 |
Message-ID: | 20150608131504.GH24997@alap3.anarazel.de |
Lists: | pgsql-general pgsql-hackers |
On 2015-06-05 16:56:18 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On June 5, 2015 10:02:37 PM GMT+02:00, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >> I think we would be foolish to rush that part into the tree. We
> >> probably got here in the first place by rushing the last round of
> >> fixes too much; let's try not to double down on that mistake.
>
> > My problem with that approach is that I think the code has gotten significantly more complex in the last few weeks. I have very little trust that the interactions between vacuum, the deferred truncations in the checkpointer, the state management in shared memory and recovery are correct. There's just too many non-local subtleties here.
>
> > I don't know what the right thing to do here is.
>
> My gut feeling is that rushing to make a release date is the wrong thing.
>
> If we have confidence that we can ship something on Monday that is
> materially more trustworthy than the current releases, then let's aim to
> do that; but let's ship only patches we are confident in. We can do
> another set of releases later that incorporate additional fixes. (As some
> wise man once said, there's always another bug.)
I've tortured hardware a fair bit with HEAD. So far it looks much better
than 9.4.2+ et al. I've noticed a bunch of, to me at least, new issues:
1) the autovacuum trigger logic isn't perfect yet. I.e. especially with
autovacuum=off you can get into situations where emergency vacuums
aren't started when necessary. This is particularly likely to happen
if either very large multixacts are used, or if the server has been
shut down while emergency autovacuums were happening. No corruption
ensues, but it's not easy to get out of.
2) I've managed to corrupt a cluster when a standby performed
restartpoints less frequently than the master performed
checkpoints. Because truncations happen in the checkpointer it's not
that hard to end up with entirely full multixact slrus. This is a
problem on several fronts. We can IIUC end up truncating away the
wrong data, and we can be in a bad state upon promotion. None of that
is new.
3) It's really confusing that truncation (and thus the limits in shared
memory) happens in checkpoints. If you hit a limit and manually do all
the necessary vacuums you'll see a "good" limit in
pg_database.datminmxid, but you'll still run into the error. You manually
have to force a checkpoint for the truncation to actually
happen. That's particularly problematic because larger installations,
where I presume wraparound issues are more likely, often have a large
checkpoint_timeout setting.
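
To illustrate 3), here is a toy Python model of a limit that is only refreshed at checkpoint time. It is a sketch under stated assumptions, not PostgreSQL code: `ToyCluster`, its fields, and the `capacity` value are all made up for illustration, and wraparound arithmetic is ignored entirely.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Hypothetical stand-in for a cluster; none of this is PostgreSQL's API."""

    # Per-database oldest multixact still needed (plays the role of
    # pg_database.datminmxid in this toy).
    datminmxid: dict = field(default_factory=dict)
    # Limit cached in "shared memory"; only checkpoint() refreshes it.
    shmem_oldest: int = 0
    next_mxid: int = 0
    capacity: int = 100  # made-up size of the multixact space

    def vacuum(self, db: str, new_min: int) -> None:
        """Advance the catalog value; the cached limit is left untouched."""
        self.datminmxid[db] = max(self.datminmxid.get(db, 0), new_min)

    def checkpoint(self) -> None:
        """Truncate and refresh the cached limit from the catalog values."""
        self.shmem_oldest = min(self.datminmxid.values(),
                                default=self.shmem_oldest)

    def allocate(self) -> int:
        """Hand out a new multixact, failing once the cached limit is hit."""
        if self.next_mxid - self.shmem_oldest >= self.capacity:
            raise RuntimeError("multixact limit reached")
        self.next_mxid += 1
        return self.next_mxid
```

In this model, filling the space, calling `vacuum("db1", 90)` and then `allocate()` still raises, even though `datminmxid` already looks fine; only after `checkpoint()` does `allocate()` succeed again.
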
Since none of these are really new, I don't think they should prevent us
from doing a back branch release. While I'm still not convinced we're
better off with 9.4.4 than with 9.4.1, we're certainly better off than
with 9.4.[23] et al.
If we want to go ahead with the release I plan to do a bit more testing
today and tomorrow. If not I'm first going to continue working on fixing
the above.
I've a "good" fix for 1). I'm not 100% sure I'll feel confident with
pushing if we wrap today. I am wondering if we shouldn't at least apply
the portion that unconditionally sends a signal in the ERROR
case. That's still an improvement.
One more thing:
Our testing infrastructure sucks. Without writing C code it's basically
impossible to test wraparounds and such. Even if not particularly useful
for non-devs, I really think we should have functions for
creating/burning xids/multixacts in core. Or at least in some extension.