From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Pg Bugs <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: PITR potentially broken in 9.2 |
Date: | 2012-12-05 02:37:17 |
Message-ID: | 20121205023717.GB8970@awork2.anarazel.de |
Lists: | pgsql-bugs pgsql-hackers |
On 2012-12-04 21:27:34 -0500, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> >> But the key is, the database was not actually consistent at that
> >> point, and so opening hot standby was a dangerous thing to do.
> >>
> >> The bug that allowed the database to open early (the original topic of
> >> this email chain) was masking this secondary issue.
>
> > Could you check whether the attached patch fixes the behaviour?
>
> Yeah, I had just come to the same conclusion: the bug is not with
> Heikki's fix, but with the pause logic. The comment says that it
> shouldn't pause unless users can connect to un-pause, but the actual
> implementation of that test is several bricks shy of a load.
>
> Your patch is better, but it's still missing a bet: what we need to be
> sure of is not merely that we *could* have told the postmaster to start
> hot standby, but that we *actually have done so*. Otherwise, it's
> flow-of-control-dependent whether it works or not; we could fail if the
> main loop hasn't gotten to CheckRecoveryConsistency since the conditions
> became true. Looking at the code in CheckRecoveryConsistency, testing
> LocalHotStandbyActive seems appropriate.
Good point.
My patch wasn't intended as something final; I just wanted confirmation
that it achieves what Jeff wants, because I at least (you possibly as
well?) misunderstood him earlier. I would probably have missed the
interaction above anyway ;)
> I also thought it was pretty dangerous that this absolutely critical
> test was not placed in recoveryPausesHere() itself, rather than relying
> on the call sites to remember to do it.
Absolutely agreed.
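For illustration only (this is not the attached patch): a minimal, self-contained sketch of the idea discussed above, with the test placed inside recoveryPausesHere() itself so that call sites cannot forget it, and keyed on LocalHotStandbyActive so that we pause only once hot standby has actually been started. The stub state, the poll counter, and the int return value are assumptions made so the sketch can run outside the server; the real function would sleep and handle interrupts inside the loop.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simulated stand-ins for server state. The names follow the discussion,
 * but these are plain variables here, not the real server globals.
 */
static bool LocalHotStandbyActive = false; /* true once hot standby was actually started */
static bool recovery_paused = false;       /* hypothetical stand-in for the shared pause flag */
static int  polls_until_resume = 0;        /* hypothetical: polls before a user un-pauses */

/* Hypothetical stub: each call models one poll of the pause flag. */
static bool
RecoveryIsPaused(void)
{
    assert(polls_until_resume >= 0);

    if (recovery_paused && polls_until_resume > 0)
    {
        polls_until_resume--;
        return true;
    }

    /* Simulate a connected user having resumed recovery. */
    recovery_paused = false;
    return false;
}

/*
 * Returns the number of poll cycles spent waiting (0 means "did not pause").
 * The return value exists only to make the sketch observable.
 */
static int
recoveryPausesHere(void)
{
    int waited = 0;

    /* Don't pause unless users can actually connect to un-pause. */
    if (!LocalHotStandbyActive)
        return 0;

    while (RecoveryIsPaused())
        waited++;               /* real code would sleep and check interrupts */

    return waited;
}
```

With LocalHotStandbyActive false, the call returns at once and leaves the pause request untouched, so recovery cannot hang waiting for a user who has no way to connect.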
> So the upshot is that I propose a patch more like the attached.
Without having run anything so far it looks good to me.
Ok, night now,
Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2012-12-05 02:53:22 | Re: PITR potentially broken in 9.2 |
Previous Message | Tom Lane | 2012-12-05 02:27:34 | Re: PITR potentially broken in 9.2 |