From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
---|---|
To: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, masao(dot)fujii(at)oss(dot)nttdata(dot)com, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: A failure of standby to follow timeline switch |
Date: | 2021-01-13 03:08:30 |
Message-ID: | CAHGQGwGtAqCd0SaAiFSS78imhxuLzSzu0oQ7oFbHjqOD_1zxEA@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Jan 13, 2021 at 10:48 AM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> At Tue, 12 Jan 2021 10:47:21 +0900, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote in
> > On Sat, Jan 9, 2021 at 5:08 AM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> > >
> > > Masao-san: Are you intending to act as committer for these? Since the
> > > bug is mine I can look into it, but since you already did all the
> > > reviewing work, I'm good with you giving it the final push.
> >
> > Thanks! I'm planning to push the patch.
> >
> >
> > > 0001 looks good to me; let's get that one committed quickly so that we
> > > can focus on the interesting stuff. While the implementation of
> > > find_in_log is quite dumb (not this patch's fault), it seems sufficient
> > > to deal with small log files. We can improve the implementation later,
> > > if needed, but we have to get the API right on the first try.
> > >
> > > 0003: The fix looks good to me. I verified that the test fails without
> > > the fix, and it passes with the fix.
> >
> > Yes.
> >
> >
> > > The test added in 0002 is a bit optimistic regarding timing, as well as
> > > potentially slow; it loops 1000 times and sleeps 100 milliseconds each
> > > time. On a very slow server (valgrind or clobber_cache animals) this
> > > might not be sufficient time, while on fast servers it may end up
> > > waiting longer than needed. Maybe we can do something like this:
> >
> > On second thought, I think that the regression test should be in
> > 004_timeline_switch.pl instead of 001_stream_rep.pl because it's
>
> Agreed. It's definitely the right place.
>
> > the test about timeline switch. Also I'm thinking that it's better to
> > test the timeline switch by checking whether some data is successfully
> > replicated, as the existing regression test for timeline switch in
> > 004_timeline_switch.pl does, instead of finding the specific message
> > in the log file. I attached the POC patch. Thoughts?
>
> It's practically a check on this issue, and looks better. The 180s
> timeout in the failure case seems a bit annoying, but that is how all
> tests of this kind behave.
Yes.
>
> The last check on table content is actually useless but it might make
> sense to confirm that replication is actually working. However, I
> don't think the test needs to insert as many as 1000 tuples. Just
> a single tuple would suffice.
Thanks for the review!
I'm OK with this change (i.e., inserting only a single row).
Attached is the updated version of the patch.
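For readers following the timing discussion above, here is a rough sketch of the "poll until a deadline" idea, as opposed to looping a fixed number of times with a fixed sleep. It is written in Python purely for illustration; the actual test is a Perl TAP script, and this is not the code in the attached patch. The names poll_until and make_check are hypothetical, and the 180-second default only mirrors the timeout mentioned in this thread.

```python
import time


def poll_until(check, timeout_s=180.0, interval_s=0.1):
    """Call check() repeatedly until it returns True or timeout_s elapses.

    Returns True as soon as check() succeeds, so a fast server does not
    wait longer than needed; returns False once the deadline has passed.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def make_check(ready_after):
    """Hypothetical stand-in for "has the inserted row reached the standby?".

    The returned callable reports success on its ready_after-th call.
    """
    state = {"calls": 0}

    def check():
        state["calls"] += 1
        return state["calls"] >= ready_after

    return check
```

With a helper like this, the success case returns as soon as the single inserted row is visible, and only the failure case pays the full timeout.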
Regards,
--
Fujii Masao
Attachment | Content-Type | Size |
---|---|---|
v6_follow_timeline_switch.patch | application/octet-stream | 2.6 KB |