From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | robertmhaas(at)gmail(dot)com |
Cc: | alvherre(at)alvh(dot)no-ip(dot)org, michael(at)paquier(dot)xyz, rjuju123(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: standby recovery fails (tablespace related) (tentative patch and discussion) |
Date: | 2022-04-04 08:29:48 |
Message-ID: | 20220404.172948.678193664696814690.horikyota.ntt@gmail.com |
Lists: | pgsql-hackers |
At Fri, 1 Apr 2022 14:51:58 -0400, Robert Haas <robertmhaas(at)gmail(dot)com> wrote in
> On Fri, Apr 1, 2022 at 12:22 AM Kyotaro Horiguchi
> <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > By the way, may I ask how we fix this? The existing recovery code
> > already sometimes generates soon-to-be-deleted files in a real
> > directory in pg_tblspc, and otherwise skips applying WAL records on
> > nonexistent heap pages. It is the "mixed" way.
>
> Can you be more specific about where we have each behavior now?
They're done in XLogReadBufferExtended.
The second behavior happens here,
xlogutils.c:
> /* hm, page doesn't exist in file */
> if (mode == RBM_NORMAL)
> {
> log_invalid_page(rnode, forknum, blkno, false);
+ Assert(0);
> return InvalidBuffer;
With this assertion added, 015_promotion_pages.pl crashes. This code
path skips page creation and the subsequent redo action on the page.
The first behavior is described in the following comment:
> * Create the target file if it doesn't already exist. This lets us cope
> * if the replay sequence contains writes to a relation that is later
> * deleted. (The original coding of this routine would instead suppress
> * the writes, but that seems like it risks losing valuable data if the
> * filesystem loses an inode during a crash. Better to write the data
> * until we are actually told to delete the file.)
> */
> smgrcreate(smgr, forknum, true);
Without the smgrcreate call, make check-world fails due to missing
files for the FSM, visibility map, and init forks, although it is a
bit doubtful that these cases fall into the category of "creating
nonexistent objects by redo access". In a few places, XLOG_FPI records
are used to create the first page of a file, including main and init
forks. But I don't see a case involving the main fork during make
check-world.
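To make the "mixed" behavior concrete, here is a toy model of the two
paths described above. It is only a sketch under my own assumptions:
every name in it (ToyStorage, toy_read_buffer_for_redo, the TOY_RBM_*
modes) is hypothetical and does not exist in PostgreSQL, and the
in-memory arrays merely stand in for smgr and the invalid-page table.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in PostgreSQL. */
#define TOY_MAX_FILES 8

typedef enum
{
    TOY_RBM_NORMAL,             /* stands in for RBM_NORMAL */
    TOY_RBM_ZERO                /* stands in for a mode that may extend */
} ToyReadMode;

typedef struct
{
    bool        exists;         /* has the relation file been created? */
    int         nblocks;        /* current length of the file in pages */
} ToyRelFile;

typedef struct
{
    ToyRelFile  files[TOY_MAX_FILES];
    int         invalid_pages;  /* pages remembered as invalid */
} ToyStorage;

/*
 * Returns true if redo may be applied to (fileno, blkno), and false if
 * the record is skipped for that page.
 */
static bool
toy_read_buffer_for_redo(ToyStorage *st, int fileno, int blkno,
                         ToyReadMode mode)
{
    ToyRelFile *f;

    assert(fileno >= 0 && fileno < TOY_MAX_FILES);
    f = &st->files[fileno];

    /* First behavior: create the target file if it doesn't exist yet. */
    if (!f->exists)
    {
        f->exists = true;
        f->nblocks = 0;
    }

    if (blkno >= f->nblocks)
    {
        /*
         * Second behavior: in normal mode the page is only remembered
         * as invalid and redo on it is skipped.
         */
        if (mode == TOY_RBM_NORMAL)
        {
            st->invalid_pages++;
            return false;
        }

        /* In the other mode, extend the file up to the requested page. */
        f->nblocks = blkno + 1;
    }

    return true;
}
```

In this model, a normal-mode access to a page past the end of a missing
file still leaves an (empty) file behind while the redo itself is
skipped, which is the combination I called "mixed".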
# Most of the failure cases show up as a standby freeze. I was a bit
# annoyed that make check-world doesn't tell which module is
# currently being tested. In that case I had to deduce it from the
# sequence of preceding script names, but if the first TAP script of a
# module freezes, I had to use ps to find the module..
> > 1. stop XLogReadBufferForRedo creating a file in nonexistent
> > directories then remember the failure (I'm not sure how big the
> > impact is.)
> >
> > 2. unconditionally create all objects required for recovery to proceed..
> > 2.1 and ignore the failures.
> > 2.2 and remember the failures.
> >
> > 3. Any other?
> >
> > 2 needs to create a real directory in pg_tblspc. So 1?
>
> I think we could either do 1 or 2. My intuition is that getting 2
> working would be less scary and more likely to be something we would
> feel comfortable back-patching, but 1 is probably a better design in
> the long term. However, I might be wrong -- that's just a guess.
Thanks. I forgot to mention this in the previous mail (though I
mentioned it somewhere upthread), but if we take 2, there's no way
other than creating a real directory in pg_tblspc during recovery. I
don't think that is neat.
I haven't found how the patch caused creation of a relation file that
is to be removed soon. However, I find that the v19 patch fails, maybe
due to some change in Cluster.pm. It will take a bit more time to
check that..
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center