From: | Florian Pflug <fgp(at)phlo(dot)org> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: cheaper snapshots |
Date: | 2011-07-28 08:16:41 |
Message-ID: | 7CC1D2D9-1944-42B1-B4AA-D308D424A5A9@phlo.org |
Lists: | pgsql-hackers |
On Jul 28, 2011, at 04:51, Robert Haas wrote:
> One fly in the ointment is that 8-byte
> stores are apparently done as two 4-byte stores on some platforms.
> But if the counter runs backward, I think even that is OK. If you
> happen to read an 8 byte value as it's being written, you'll get 4
> bytes of the intended value and 4 bytes of zeros. The value will
> therefore appear to be less than what it should be. However, if the
> value was in the midst of being written, then it's still in the midst
> of committing, which means that that XID wasn't going to be visible
> anyway. Accidentally reading a smaller value doesn't change the
> answer.
That only works if the update of the most-significant word is guaranteed
to be visible before the update of the least-significant one, which
I think you can only enforce if you update the words individually
(and use a fence on e.g. PPC32). Otherwise you're at the mercy of the
compiler.
Without that ordering, the following might happen. To keep the numbers
small, I'm using a 2-byte value instead of an 8-byte one and assuming that
1-byte stores are atomic while 2-byte ones aren't. The machine is assumed
to be big-endian.
The counter is at 0xff00
Backend 1 decrements the counter to 0xfeff, i.e. does
(1) STORE [counter+1], 0xff
(2) STORE [counter], 0xfe
Backend 2 reads
(1') LOAD [counter+1]
(2') LOAD [counter]
If the sequence of events is (1), (1'), (2'), (2), backend 2 will read
0xffff which is higher than it should be.
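
The interleaving can be replayed deterministically in a single thread.
This is only an illustration of the scenario above, not real concurrent
code. replay_torn_read is a hypothetical helper, and the byte array
stands in for big-endian memory:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: replays the sequence (1), (1'), (2'), (2) for a
 * decrement of a big-endian 2-byte counter and returns the value that
 * backend 2 assembles from its two 1-byte loads.
 */
uint16_t
replay_torn_read(uint16_t start)
{
    uint8_t     counter[2];
    uint16_t    new_value;
    uint8_t     lo;
    uint8_t     hi;

    assert(start > 0);                          /* must be decrementable */

    counter[0] = (uint8_t) (start >> 8);        /* big-endian: high byte first */
    counter[1] = (uint8_t) (start & 0xff);
    new_value = (uint16_t) (start - 1);

    counter[1] = (uint8_t) (new_value & 0xff);  /* (1)  backend 1 stores low byte */
    lo = counter[1];                            /* (1') backend 2 loads low byte */
    hi = counter[0];                            /* (2') backend 2 loads high byte */
    counter[0] = (uint8_t) (new_value >> 8);    /* (2)  backend 1 stores high byte */

    return (uint16_t) ((hi << 8) | lo);
}
```

Calling replay_torn_read(0xff00) returns 0xffff: the reader combines the
old high byte with the new low byte.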
But we could simply use a spinlock to protect the read on machines where
we don't know for sure that 64-bit reads and writes are atomic. That'll
only really hurt on machines with 16+ cores or so, and the number of
architectures that scale to that many cores isn't that high anyway. If we
supported spinlock-less operation on SPARC, x86-64, PPC64 and maybe
Itanium, would we miss any important one?
best regards,
Florian Pflug