From: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: race condition in sync rep |
Date: | 2011-03-26 10:16:36 |
Message-ID: | AANLkTi=H2SKD6zjxZLps48uN_yY9cKzNriz1Cbot3hHr@mail.gmail.com |
Lists: | pgsql-hackers |
On Sat, Mar 26, 2011 at 1:11 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I believe I've figured out why synchronous replication has such
> terrible performance with fsync=off: it has a nasty race condition.
> It may happen - if the standby responds very quickly - that the
> standby acks the commit record and awakens waiters before the
> committing backend actually begins to wait. There's no cross-check
> for this: the committing backend waits unconditionally, with no regard
> to whether the necessary ACK has already arrived. At this point we
> may be in for a very long wait: another ACK will be required to
> release waiters, and that may not be immediately forthcoming. I had
> thought that the next ACK (after at most wal_receiver_status_interval)
> would do the trick, but it appears to be even worse than that: by
> making the standby win the race, I was easily able to get the master
> to hang for over a minute, and it only got released when I committed
> another transaction. Had I been sufficiently patient, the next
> checkpoint probably would have done the trick.
>
> Of course, with fsync=off on the standby, it's much easier for the
> standby to win the race.
That seems very unlikely even with fsync=off in a real config, where we
have a network path to consider.
Your definition of a "nasty" race condition seems off.
I've added code for you.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
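
Archive note: the race described in the quoted message has the shape of a classic lost wakeup, where the ACK arrives before the committing backend starts to wait. The sketch below is a toy model of that pattern in Python and of the kind of cross-check the quoted text says is missing. It is not PostgreSQL source and not the code referred to in this message; the class, the method names and the use of an integer `lsn` as a log position are all assumptions made for illustration.

```python
import threading


class SyncRepWaiter:
    """Toy model (hypothetical) of a backend waiting for a standby ACK."""

    def __init__(self):
        self._cond = threading.Condition()
        self._acked_lsn = 0  # highest log position acknowledged so far

    def ack(self, lsn):
        """Called on behalf of the standby: record the ACK, wake waiters."""
        with self._cond:
            if lsn > self._acked_lsn:
                self._acked_lsn = lsn
            self._cond.notify_all()

    def wait_unconditionally(self, timeout):
        """Racy variant: sleeps without checking whether the ACK is in.

        If ack() ran first, the wakeup is already gone and this call
        blocks until another ACK arrives or the timeout expires.
        """
        with self._cond:
            return self._cond.wait(timeout)

    def wait_for(self, lsn, timeout):
        """Cross-checked variant: tests the recorded ACK under the lock
        before sleeping, and again after every wakeup."""
        with self._cond:
            return self._cond.wait_for(
                lambda: self._acked_lsn >= lsn, timeout
            )


waiter = SyncRepWaiter()
waiter.ack(100)  # the standby wins the race

missed = waiter.wait_unconditionally(timeout=0.05)  # times out
seen = waiter.wait_for(100, timeout=0.05)           # returns at once
```

In this model `missed` is False because the only wakeup was delivered before anyone was waiting, while `seen` is True because the waiter consults the recorded state instead of relying on a future notification.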