From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Florian Pflug <fgp(at)phlo(dot)org> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: cheaper snapshots |
Date: | 2011-07-28 13:05:54 |
Message-ID: | CA+Tgmoa-=+XzifRTeCCtzWjpkFHwWeSqdWwAAupgBr3D4R5PcQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Jul 28, 2011 at 4:16 AM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
> On Jul28, 2011, at 04:51 , Robert Haas wrote:
>> One fly in the ointment is that 8-byte
>> stores are apparently done as two 4-byte stores on some platforms.
>> But if the counter runs backward, I think even that is OK. If you
>> happen to read an 8 byte value as it's being written, you'll get 4
>> bytes of the intended value and 4 bytes of zeros. The value will
>> therefore appear to be less than what it should be. However, if the
>> value was in the midst of being written, then it's still in the midst
>> of committing, which means that that XID wasn't going to be visible
>> anyway. Accidentally reading a smaller value doesn't change the
>> answer.
>
> That only works if the update of the most-significant word is guaranteed
> to be visible before the update to the least-significant one. Which
> I think you can only enforce if you update the words individually
> (and use a fence on e.g. PPC32). Otherwise you're at the mercy of the
> compiler.
>
> Otherwise, the following might happen (with a 2-byte value instead of an
> 8-byte one, and the assumption that 1-byte stores are atomic while 2-byte
> ones aren't. Just to keep the numbers smaller. The machine is assumed to be
> big-endian)
>
> The counter is at 0xff00
> Backend 1 decrements, i.e. does
> (1) STORE [counter+1], 0xff
> (2) STORE [counter], 0xfe
>
> Backend 2 reads
> (1') LOAD [counter+1]
> (2') LOAD [counter]
>
> If the sequence of events is (1), (1'), (2'), (2), backend 2 will read
> 0xffff which is higher than it should be.
You're confusing two different things - I agree that you need a
spinlock around reading the counter, unless 8-byte loads and stores
are atomic.
What I'm saying can be done without a lock is reading the commit-order
value for a given XID. If that's in the middle of being updated, then
the old value was zero, so the scenario you describe can't occur.
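
To make the distinction concrete, here is a minimal sketch in C that models a torn 8-byte store deterministically, without threads. It assumes each 4-byte half is stored atomically while the pair is not. The names split64, half_order, split64_read and torn_read are hypothetical and exist only for this illustration; none of this is code from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical model of a 64-bit slot whose two 32-bit halves are each
 * stored atomically, while the pair as a whole is not.
 */
typedef struct
{
    uint32_t hi;    /* most-significant word */
    uint32_t lo;    /* least-significant word */
} split64;

/* Which half of the new value reaches memory first. */
enum half_order { HI_FIRST, LO_FIRST };

/* Combine the two halves the way a reader would observe them. */
static uint64_t
split64_read(const split64 *slot)
{
    return ((uint64_t) slot->hi << 32) | slot->lo;
}

/*
 * Return what a reader sees if it loads the slot after only the first
 * of the two 32-bit stores replacing old_value with new_value has landed.
 */
static uint64_t
torn_read(uint64_t old_value, uint64_t new_value, enum half_order order)
{
    split64 slot = { (uint32_t) (old_value >> 32), (uint32_t) old_value };

    if (order == HI_FIRST)
        slot.hi = (uint32_t) (new_value >> 32);
    else
        slot.lo = (uint32_t) new_value;

    return split64_read(&slot);
}
```

With an old value of zero, torn_read(0, 0x0000000500000007, HI_FIRST) gives 0x0000000500000000 and the LO_FIRST case gives 0x7. Both are below the intended value, whichever half lands first, which is the commit-order case. For a shared counter stepping down from 0x0000000100000000 to 0x00000000FFFFFFFF, the LO_FIRST case gives 0x00000001FFFFFFFF, which is above the old value. That is the scenario from your example, and it is why reading the counter still needs the spinlock.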
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company