From: | Paul Guo <pguo(at)pivotal(dot)io> |
---|---|
To: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
Cc: | PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: standby recovery fails (tablespace related) (tentative patch and discussion) |
Date: | 2019-05-13 09:37:50 |
Message-ID: | CAEET0ZF9yN4DaXyuFLzOcAYyxuFF1Ms_OQWeA+Rwv3GhA5Q-SA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Thanks for the reply.
On Tue, May 7, 2019 at 2:47 PM Kyotaro HORIGUCHI <
horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>
> + if (!(stat(parent_path, &st) == 0 && S_ISDIR(st.st_mode)))
> + {
>
> This patch is allowing missing source and destination directory
> even in consistent state. I don't think it is safe.
>
I do not understand this. Can you elaborate?
>
>
>
> + ereport(WARNING,
> + (errmsg("directory \"%s\" for copydir() does not exists."
> + "It is possibly expected. Skip copydir().",
> + parent_path)));
>
> This message seems unfriendly to users, or it seems like an elog
> message. How about something like this. The same can be said for
> the source directory.
>
> | WARNING: skipped creating database directory: "%s"
> | DETAIL: The tabelspace %u may have been removed just before crash.
>
Yeah. Looks better.
>
> # I'm not confident in this at all:(
>
> > 2) Fixed dbase_desc(). Now the xlog output looks correct.
> >
> > rmgr: Database len (rec/tot): 42/ 42, tx: 486, lsn:
> > 0/016386A8, prev 0/01638630, desc: CREATE copy dir base/1 to
> > pg_tblspc/16384/PG_12_201904281/16386
> >
> > rmgr: Database len (rec/tot): 34/ 34, tx: 487, lsn:
> > 0/01638EB8, prev 0/01638E40, desc: DROP dir
> > pg_tblspc/16384/PG_12_201904281/16386
>
> WAL records don't convey such information. The previous
> description seems right to me.
>
2019-04-17 14:52:14.951 CST [23030] CONTEXT: WAL redo at 0/3011650 for
Database/CREATE: copy dir 1663/1 to 65546/65547
The directories are definitely wrong and misleading.
> > > Also I'd suggest we should use pg_mkdir_p() in
> TablespaceCreateDbspace()
> > > to replace
> > > the code block includes a lot of
> > > get_parent_directory(), MakePGDirectory(), etc even it
> > > is not fixing a bug since pg_mkdir_p() code change seems to be more
> > > graceful and simpler.
>
> But I don't agree to this. pg_mkdir_p goes above two-parents up,
> which would be unwanted here.
>
> I do not understand this also. pg_mkdir_p() is similar to 'mkdir -p'.
This change just makes the code concise. Though in theory the change is not
needed.
> > > Whatever ignore mkdir failure or mkdir_p, I found that these steps
> seem to
> > > be error-prone
> > > along with postgre evolving since they are hard to test and also we are
> > > not easy to think out
> > > various potential bad cases. Is it possible that we should do real
> > > checkpoint (flush & update
> > > redo lsn) when seeing checkpoint xlogs for these operations? This will
> > > slow down standby
> > > but master also does this and also these operations are not usual,
> > > espeically it seems that it
> > > does not slow down wal receiving usually?
>
> That dramatically slows recovery (not replication) if databases
> are created and deleted frequently. That wouldn't be acceptable.
>
This behavior is rare and seems to have the same impact on master & standby
from checkpoint/restartpoint.
We do not worry about master so we should not worry about standby also.
From | Date | Subject | |
---|---|---|---|
Next Message | Masahiko Sawada | 2019-05-13 10:28:25 | Re: vacuumdb and new VACUUM options |
Previous Message | Oleg Bartunov | 2019-05-13 09:14:48 | Re: PostgreSQL 12: Feature Highlights |