From: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | pguo(at)pivotal(dot)io |
Cc: | apraveen(at)pivotal(dot)io, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: standby recovery fails (tablespace related) (tentative patch and discussion) |
Date: | 2019-04-22 07:40:27 |
Message-ID: | 20190422.164027.33866403.horiguchi.kyotaro@lab.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Oops! The comment in the previous patch is wrong.
At Mon, 22 Apr 2019 16:15:13 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote in <20190422(dot)161513(dot)258021727(dot)horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
> At Mon, 22 Apr 2019 12:36:43 +0800, Paul Guo <pguo(at)pivotal(dot)io> wrote in <CAEET0ZGpUrMGUzfyzVF9FuSq+zb=QovYa2cvyRnDOTvZ5vXxTw(at)mail(dot)gmail(dot)com>
> > Please see my replies inline. Thanks.
> >
> > On Fri, Apr 19, 2019 at 12:38 PM Asim R P <apraveen(at)pivotal(dot)io> wrote:
> >
> > > On Wed, Apr 17, 2019 at 1:27 PM Paul Guo <pguo(at)pivotal(dot)io> wrote:
> > > >
> > > > create db with tablespace
> > > > drop database
> > > > drop tablespace.
> > >
> > > Essentially, that sequence of operations causes crash recovery to fail
> > > if the "drop tablespace" transaction was committed before crashing.
> > > This is a bug in crash recovery in general and should be reproducible
> > > without configuring a standby. Is that right?
> > >
> >
> > No. In general, checkpoint is done for drop_db/create_db/drop_tablespace on
> > master.
> > That makes the file/directory update-to-date if I understand the related
> > code correctly.
> > For standby, checkpoint redo does not ensure that.
>
> That's right partly. As you must have seen, fast shutdown forces
> restartpoint for the last checkpoint and it prevents the problem
> from happening. Anyway it seems to be a problem.
>
> > > Your patch creates missing directories in the destination. Don't we
> > > need to create the tablespace symlink under pg_tblspc/? I would
> > >
> >
> > 'create db with tablespace' redo log does not include the tablespace real
> > directory information.
> > Yes, we could add in it into the xlog, but that seems to be an overdesign.
>
> But I don't think creating directory that is to be removed just
> after is a wanted solution. The directory most likely to be be
> removed just after.
>
> > > prefer extending the invalid page mechanism to deal with this, as
> > > suggested by Ashwin off-list. It will allow us to avoid creating
> >
> > directories and files only to remove them shortly afterwards when the
> > > drop database and drop tablespace records are replayed.
> > >
> > >
> > 'invalid page' mechanism seems to be more proper for missing pages of a
> > file. For
> > missing directories, we could, of course, hack to use that (e.g. reading
> > any page of
> > a relfile in that database) to make sure the tablespace create code
> > (without symlink)
> > safer (It assumes those directories will be deleted soon).
> >
> > More feedback about all of the previous discussed solutions is welcome.
>
> It doesn't seem to me that the invalid page mechanism is
> applicable in straightforward way, because it doesn't consider
> simple file copy.
>
> Drop failure is ignored any time. I suppose we can ignore the
> error to continue recovering as far as recovery have not reached
> consistency. The attached would work *at least* your case, but I
> haven't checked this covers all places where need the same
> treatment.
The comment for the new function XLogMakePGDirectory is wrong:
+ * There is a possibility that WAL replay causes a creation of the same
+ * directory left by the previous crash. Issuing ERROR prevents the caller
+ * from continuing recovery.
The correct one is:
+ * There is a possibility that WAL replay causes an error by creation of a
+ * directory under a directory removed before the previous crash. Issuing
+ * ERROR prevents the caller from continuing recovery.
It is fixed in the attached.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
ignore_dir_create_error_before_consistency_v2.patch | text/x-patch | 2.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Langote | 2019-04-22 07:49:19 | display of variables in EXPLAIN VERBOSE |
Previous Message | Konstantin Knizhnik | 2019-04-22 07:38:18 | Re: block-level incremental backup |