From: | Noah Misch <noah(at)2ndQuadrant(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: sinval synchronization considered harmful |
Date: | 2011-07-26 20:38:08 |
Message-ID: | 20110726203804.GB17857@tornado.leadboat.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Jul 26, 2011 at 01:36:26PM -0400, Robert Haas wrote:
> On Mon, Jul 25, 2011 at 6:24 PM, Noah Misch <noah(at)2ndquadrant(dot)com> wrote:
> > On Fri, Jul 22, 2011 at 03:54:03PM -0400, Robert Haas wrote:
> >> On Fri, Jul 22, 2011 at 3:28 PM, Noah Misch <noah(at)2ndquadrant(dot)com> wrote:
> >> > This is attractive, and I don't see any problems with it. (In theory, you could
> >> > hit a case where the load of resetState gives an ancient "false" just as the
> >> > counters wrap to match. Given that the wrap interval is 1000000x as long as the
> >> > reset interval, I'm not worried about problems on actual silicon.)
> >>
> >> It's actually 262,144 times as long - see MSGNUMWRAPAROUND.
> >
> > Ah, so it is.
> >
> >> It would be pretty easy to eliminate even the theoretical possibility
> >> of a race by getting rid of resetState altogether and using nextMsgNum
> >> = -1 to mean that. Maybe I should go ahead and do that.
> >
> > Seems like a nice simplification.
>
> On further reflection, I don't see that this helps: it just moves the
> problem around.
Alas, yes.
> With resetState as a separate variable, nextMsgNum is
> never changed by anyone other than the owner, so we can never have a
> stale load.
It's also changed when SICleanupQueue() decides to wrap everyone. This could
probably be eliminated by using uint32 and letting overflow take care of wrap
implicitly, but ...
> But if we overload nextMsgNum to also indicate whether
> our state has been reset, then there's a race between when we load
> nextMsgNum and when we load maxMsgNum (instead of code I posted
> previously, which has a race between when we load resetState and when
> we load maxMsgNum). Now, as you say, it seems really, really
> difficult to hit that in practice, but I don't see a way of getting
> rid of the theoretical possibility without either (1) a spinlock or
> (2) a fence. (Of course, on x86, the fence could be optimized down to
> a compiler barrier.)
No new ideas come to mind, here. We could migrate back toward your original
proposal of making the counter a non-wrapping uint64; Florian described some
nice techniques for doing atomic 64-bit reads on x86:
http://archives.postgresql.org/message-id/7C94C386-122F-4918-8624-A14352E8DBC5@phlo.org
I'm not personally acquainted with those approaches, but they sound promising.
Since the overlap between 32-bit installations and performance-sensitive
installations is all but gone, we could even just use a spinlock under 32-bit.
> I guess the question is "should we worry about that?".
I'd lean toward "no". I share Tom's unease about writing off a race condition
as being too improbable, but this is quite exceptionally improbable. On the
other hand, some of the fixes don't look too invasive.
--
Noah Misch http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Florian Pflug | 2011-07-26 20:44:56 | XMLATTRIBUTES vs. values of type XML |
Previous Message | Peter Eisentraut | 2011-07-26 20:30:16 | pgsql: Add missing newlines at end of error messages |