From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [BUGS] Breakage with VACUUM ANALYSE + partitions |
Date: | 2016-04-22 17:07:36 |
Message-ID: | 20160422170736.bjrsusb4ubdhdtl4@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs pgsql-hackers |
On 2016-04-14 11:09:29 +0530, Abhijit Menon-Sen wrote:
> At 2016-04-12 09:00:57 -0400, robertmhaas(at)gmail(dot)com wrote:
> >
> > On Mon, Apr 11, 2016 at 1:17 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > >
> > > 3) Actually handle the case of the last open segment not being
> > > RELSEG_SIZE properly in _mdfd_getseg() - mdnblocks() does so.
> >
> > #3 seems like it's probably about 15 years overdue, so let's do that
> > anyway.
>
> I don't have anything useful to contribute, I'm just trying to
> understand this fix.
>
> _mdfd_getseg() looks like this:
>
> targetseg = blkno / ((BlockNumber) RELSEG_SIZE);
> for (nextsegno = 1; nextsegno <= targetseg; nextsegno++)
> {
> Assert(nextsegno == v->mdfd_segno + 1);
>
> if (v->mdfd_chain == NULL)
> {
> /*
> * Normally we will create new segments only if authorized by the
> * caller (i.e., we are doing mdextend()). […]
> */
> if (behavior == EXTENSION_CREATE || InRecovery)
> {
> if (_mdnblocks(reln, forknum, v) < RELSEG_SIZE)
> /* mdextend */
> v->mdfd_chain = _mdfd_openseg(reln, forknum, +nextsegno, O_CREAT);
> }
> else
> {
> /* We won't create segment if not existent */
> v->mdfd_chain = _mdfd_openseg(reln, forknum, nextsegno, 0);
> }
> if (v->mdfd_chain == NULL)
> /* return NULL or ERROR */
> }
> v = v->mdfd_chain;
> }
>
> Do I understand correctly that the case of the "last open segment"
> (i.e., the one for which mdfd_chain == NULL) not being RELSEG_SIZE
> (i.e., _mdnblocks(reln, forknum, v) < RELSEG_SIZE) should not call
> _mfdf_openseg on nextsegno if behaviour is not EXTENSION_CREATE or
> InRecovery?
>
> And that "We won't create segment if not existent" should happen, but
> doesn't only because the next segment file wasn't removed earlier, so
> we have to add an extra check for that case?
>
> In other words, is something like the following what's needed here, or
> is there more to it?
It's a bit more complicated than that, but that's the principle. Thanks
for the patch, and sorry for my late response. See my attached version
for a more fleshed out version of this.
Looking at the code around this I found:
* _mdfd_getseg(), in contrast to mdnblocks() doesn't check segment size,
neither whether to continue with a too small, nor to error out with a
too big segment
* For, imo pretty bad, reasons in mdsync() we currently rely on
EXTENSION_RETURN_NULL to extend the file in recovery, and to set errno
to ENOENT. Brrr.
* we currently FATAL if a segment is too big - I've copied that
behaviour, but why not just ERROR out?
The attached patch basically adds the segment size checks to
_mdfd_getseg(), and doesn't perform extension, even in recovery, if
EXTENSION_REALLY_RETURN_NULL is passed.
This fixes the issue for me, both in the originally reported variant,
and in some more rudely setup version (i.e. rm'ing files).
I'm seing some weird behaviour (even with writeback flushing disabled)
with smgr invals and recovery/standbys, but I don't want to dive into it
before http://www.postgresql.org/message-id/CAD21AoDpZ6Xjg=gFrGPnSn4oTRRcwK1EBrWCq9OqOHuAcMMC=w@mail.gmail.com
is addressed, it's too likely to be related.
Greetings,
Andres Freund
Attachment | Content-Type | Size |
---|---|---|
0001-Don-t-open-formally-non-existant-segments-in-_mdfd_g.patch | text/x-patch | 5.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Mathias Kunter | 2016-04-22 20:59:49 | Re: BUG #14107: Major query planner bug regarding subqueries and indices |
Previous Message | Victor Yegorov | 2016-04-22 16:57:02 | Re: BUG #14107: Major query planner bug regarding subqueries and indices |
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Janes | 2016-04-22 18:20:52 | Re: GIN data corruption bug(s) in 9.6devel |
Previous Message | Andres Freund | 2016-04-22 16:36:14 | Re: kqueue |