| From: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> | 
|---|---|
| To: | const_sunny(at)126(dot)com | 
| Cc: | PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org> | 
| Subject: | Re: BUG #14721: Assertion of synchronous replication | 
| Date: | 2017-06-29 05:11:47 | 
| Message-ID: | CAEepm=33AAJFu9CMPhCsnX-Zg6ZySrga_oAjRV0Dtgx2G03kpQ@mail.gmail.com | 
| Lists: | pgsql-bugs | 
On Thu, Jun 29, 2017 at 4:27 PM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>         /*
>          * Acquiring the lock is not needed, the latch ensures proper
>          * barriers. If it looks like we're done, we must really be done,
>          * because once walsender changes the state to SYNC_REP_WAIT_COMPLETE,
>          * it will never update it again, so we can't be seeing a stale value
>          * in that case.
>          */
Yeah, counting on the latch for free barriers doesn't work if you
happen to see SYNC_REP_WAIT_COMPLETE the first time through the loop, or
if a spurious signal wakes you and the state is set to
SYNC_REP_WAIT_COMPLETE immediately afterwards.  In those cases, the
Assert statement that follows is making an assertion about cache
coherency that doesn't hold even on a friendly TSO system.
Can you reproduce the problem with this experimental patch applied?
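
To illustrate the general pattern under discussion (this is not the content of barriers.patch), here is a minimal sketch of a lock-free "done" check that pairs a release store on the waking side with an acquire load on the waiting side. It uses C11 atomics rather than PostgreSQL's own barrier macros, and every name in it (Waiter, on_queue, waiter_init, waker_complete, waiter_is_done, WAIT_PENDING, WAIT_COMPLETE) is a hypothetical stand-in, not a PostgreSQL identifier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical wait states; stand-ins only, not PostgreSQL's definitions. */
enum { WAIT_PENDING = 1, WAIT_COMPLETE = 2 };

typedef struct Waiter
{
    int         on_queue;   /* plain field the waker clears before completing */
    atomic_int  state;      /* flag the waiter polls without taking a lock */
} Waiter;

static void
waiter_init(Waiter *w)
{
    w->on_queue = 1;
    atomic_init(&w->state, WAIT_PENDING);
}

/*
 * Waker side: finish all other writes first, then publish the state.
 * The release store keeps the on_queue write from being reordered after it.
 */
static void
waker_complete(Waiter *w)
{
    w->on_queue = 0;
    atomic_store_explicit(&w->state, WAIT_COMPLETE, memory_order_release);
}

/*
 * Waiter side: check the state without a lock.  The acquire load pairs with
 * the release store above, so once WAIT_COMPLETE is observed the earlier
 * on_queue write is visible too, and the assertion is justified.  With a
 * plain load and no barrier, nothing would order the two reads.
 */
static bool
waiter_is_done(Waiter *w)
{
    if (atomic_load_explicit(&w->state, memory_order_acquire) != WAIT_COMPLETE)
        return false;
    assert(w->on_queue == 0);
    return true;
}
```

The point of the sketch is that the ordering guarantee comes from the paired release/acquire operations themselves, so it holds on the very first check and after a spurious wakeup, without depending on a latch having been set and reset in between.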
-- 
Thomas Munro
http://www.enterprisedb.com
| Attachment | Content-Type | Size | 
|---|---|---|
| barriers.patch | application/octet-stream | 1.2 KB | 