| From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Refactor pg_rewind code and make it work against a standby |
| Date: | 2020-11-15 15:10:53 |
| Message-ID: | 037e42e3-8109-40d5-5be4-36912e5a7b69@iki.fi |
| Lists: | pgsql-hackers |
On 15/11/2020 09:07, Tom Lane wrote:
> I wrote:
>> Not sure if you noticed, but piculet has twice failed the
>> 007_standby_source.pl test that was added by 9c4f5192f:
>> ...
>> Now, I'm not sure what to make of that, but I can't help noticing that
>> piculet uses --disable-atomics while francolin uses --disable-spinlocks.
>> That leads the mind towards some kind of low-level synchronization
>> bug ...
>
> Or, maybe it's less mysterious than that. The failure looks like we
> have not waited long enough for the just-inserted row to get replicated
> to node C. That wait is implemented as
>
> $lsn = $node_a->lsn('insert');
> $node_b->wait_for_catchup('node_c', 'write', $lsn);
>
> which looks fishy ... shouldn't wait_for_catchup be told to
> wait for replay of that LSN, not just write-the-WAL?
Yep, quite right. Fixed that way, thanks for the debugging!
- Heikki
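
For readers following along, the fix agreed on above presumably amounts to changing the wait mode from 'write' to 'replay'. The fragment below is only a sketch: it assumes the same $node_a, $node_b and 'node_c' objects that 007_standby_source.pl already sets up, so it runs only inside that TAP test, and the committed change may differ in detail.

```perl
# Sketch only: depends on the nodes created earlier in 007_standby_source.pl.
# Take the current insert position on node A, the head of the cascade.
my $lsn = $node_a->lsn('insert');

# Wait until node C has *replayed* WAL up to that position, not merely
# written it to disk; only replayed changes are visible to queries on C.
$node_b->wait_for_catchup('node_c', 'replay', $lsn);
```

With 'write', the wait can return while the inserted row is still not visible on node C, which matches the intermittent failure seen on piculet.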