From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | Serge Negodyuck <petr(at)petrovich(dot)kiev(dot)ua>, Pg Bugs <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby |
Date: | 2014-06-09 21:15:24 |
Message-ID: | CAMkU=1x4aTiQjECT0TcVDnM=Sgi_vG_3XfYB6rajR82Ym6PjJg@mail.gmail.com |
Lists: | pgsql-bugs pgsql-hackers |
On Fri, Jun 6, 2014 at 9:47 AM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
wrote:
> Alvaro Herrera wrote:
> > Serge Negodyuck wrote:
> >
> > > 2014-06-02 08:20:55 EEST 172.18.10.4 db PANIC: could not access status of
> > > transaction 2080547
> > > 2014-06-02 08:20:55 EEST 172.18.10.4 db DETAIL: Could not open file
> > > "pg_multixact/members/14078": No such file or directory.
> > > 2014-06-02 08:20:55 EEST 172.18.10.4 db CONTEXT: SQL statement " UPDATE
> > > ....."
> >
> > So as it turns out, this was caused because the arithmetic to handle the
> > wraparound case neglected to handle multixacts with more members than
> > the number that fit in the last page(s) of the last segment, leading to
> > a number of pages in the 14078 segment (or whatever the last segment is
> > for a given BLCKSZ) to fail to be initialized. This patch is a rework
> > of that arithmetic, although it seems a little bit too obscure.
>
> After some simplification I think it should be clearer. Thanks Andres
> for commenting offlist.
>
> There is a different way to compute the "difference" proposed by Andres,
> without using if/else; the idea is to cast the values to int64 and then
> clamp. It would be something like
>
> uint64 diff64;
>
> diff64 = Min((uint64) offset + nmembers,
> (uint64) offset + MULTIXACT_MEMBERS_PER_PAGE);
> difference = (uint32) Min(diff64, MaxMultiXactOffset);
>
> (There are other ways to formulate this, of course, but this seems to me
> to be the most readable one). I am undecided between this and the one I
> propose in the patch, so I've stuck with the patch.
>
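For anyone following the quoted arithmetic, here is a minimal standalone sketch of the widen-then-clamp idea. It is not the code from the patch: the function names are hypothetical, and the constant is assumed to match MaxMultiXactOffset (0xFFFFFFFF) from the PostgreSQL headers.

```c
#include <stdint.h>

/* Assumed to equal MaxMultiXactOffset in the PostgreSQL headers. */
#define MAX_MULTIXACT_OFFSET UINT32_C(0xFFFFFFFF)

/* Naive 32-bit sum: silently wraps modulo 2^32 near the top of the range. */
static uint32_t naive_end(uint32_t offset, uint32_t nmembers)
{
    return offset + nmembers;
}

/* Widen to 64 bits first so the sum cannot wrap, then clamp to the maximum. */
static uint32_t clamped_end(uint32_t offset, uint32_t nmembers)
{
    uint64_t end64 = (uint64_t) offset + nmembers;

    if (end64 > MAX_MULTIXACT_OFFSET)
        end64 = MAX_MULTIXACT_OFFSET;
    return (uint32_t) end64;
}
```

With offset 0xFFFFFFF0 and 0x20 members, the naive sum wraps to 0x10 while the widened version stays pinned at 0xFFFFFFFF, which is the kind of case the wraparound handling has to get right.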
This patch seems to solve a problem I've also been having with non-existent
"pg_multixact/members/13D35" files in my testing of torn page writes and fk
locks against recent 9.4. However, sometimes I've found the problem even
before multixact wraparound occurred, and my reading of this thread suggests
that the current issue would not show up in that case, so I don't know how
much comfort to take in the knowledge that the problem seems to have gone
away--as they say, things that go away by themselves without explanation
also tend to come back by themselves.
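
As a sanity check on the segment names, here is a small sketch of how a member offset maps to a file under pg_multixact/members. The constants are my assumption of the 9.3/9.4-era layout (20-byte member groups holding 4 members each, 32 pages per SLRU segment), and the function names are hypothetical, not taken from multixact.c.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed layout constants; check multixact.c and slru.h before relying on them. */
#define MEMBERGROUP_SIZE        20u  /* 4 flag bytes + 4 four-byte xids */
#define MEMBERS_PER_GROUP       4u
#define SLRU_PAGES_PER_SEGMENT  32u

/* Number of members stored on one page of the given block size. */
static uint32_t members_per_page(uint32_t blcksz)
{
    return (blcksz / MEMBERGROUP_SIZE) * MEMBERS_PER_GROUP;
}

/* Segment number that holds the given member offset. */
static uint32_t member_segment(uint32_t offset, uint32_t blcksz)
{
    uint32_t page = offset / members_per_page(blcksz);

    return page / SLRU_PAGES_PER_SEGMENT;
}

/* Format the segment path the way it appears in the error message. */
static void member_segment_path(uint32_t offset, uint32_t blcksz,
                                char *buf, size_t buflen)
{
    snprintf(buf, buflen, "pg_multixact/members/%04X",
             (unsigned int) member_segment(offset, blcksz));
}
```

Under those assumptions and the default BLCKSZ of 8192, the highest offset 0xFFFFFFFF lands in segment 14078, the same file named in the quoted PANIC, so 13D35 is a segment somewhat below the wraparound point.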
Anyone wishing to investigate further can find the testing harness and the
tarball of a bad data directory here:
https://drive.google.com/folderview?id=0Bzqrh1SO9FcEb1FNcm52aEMtWTA&usp=sharing
Starting up the bad data directory should nominally succeed, but then the
first database wide vacuum should fail when it hits the bad tuple.
Recent changes to my testing harness were:

1. Make some of the transactions randomly roll back in the test harness, to
test transaction aborts not associated with server crashes.

2. Make the p_id in the child table get changed only half the time, not all
the time, so that a mixture of HOT updates and non-HOT updates gets tested.

3. Add a patch to allow fast-forward of multixact, and add code to my harness
to trigger that patch. It is possible that this patch itself is causing
the problem, but I suspect it is just accelerating the discovery of it.
(By the way, I did this wrong for my original intent, as I had it create
quite large multixacts, when I should have had it create more of them but
smaller, so wraparound would occur more often. But I don't want to change
it now until the problem is resolved.)

4. Add a delay.pl which delivers SIGSTOP to random postgres processes and then
waits a while before sending SIGCONT to them, to try to uncover unlikely races.
Surprisingly this does not slow things down by very much. I thought processes
would frequently get interrupted while holding important locks,
but it actually seems to be rare.

5. Turn on archive_mode and set wal_level to logical.

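For illustration, the stop-and-resume step in delay.pl amounts to roughly the following. This is a sketch in C rather than the actual Perl script, and the function names are hypothetical; choosing which postgres pids to pass in is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Pick one pid at random from a non-empty array of candidates. */
static pid_t pick_random(const pid_t *pids, size_t n)
{
    return pids[(size_t) rand() % n];
}

/*
 * Stop one process, wait, then let it continue.
 * Returns 0 on success, -1 if either signal could not be delivered.
 */
static int stop_then_continue(pid_t pid, unsigned int pause_seconds)
{
    if (kill(pid, SIGSTOP) != 0)
        return -1;
    sleep(pause_seconds);
    if (kill(pid, SIGCONT) != 0)
        return -1;
    return 0;
}
```

A driver would loop over this with a random pause each time, for example stop_then_continue(pick_random(pids, n), 5).
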
The problem is that the last two of them (the delay.pl and the archive) I
can reverse and still see the problem, while the other 3 were extensively
tested elsewhere without seeing a problem, so I can't figure out what the
trigger is.
When the problem shows up it takes anywhere from 1 to 13 hours to do so.
Anyway, if no one has opinions to the contrary I think I will just assume
this is fixed now and move on to other tests.
Cheers,
Jeff