From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Atomic operations within spinlocks |
Date: | 2020-06-06 00:19:26 |
Message-ID: | 20200606001926.zin5getvvhqppnm2@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 2020-06-04 19:33:02 -0700, Andres Freund wrote:
> But it looks like that code is currently buggy (and looks like it always
> has been), because we don't look at the nested argument when
> initializing the semaphore. So we currently allocate too many
> semaphores, without benefiting from them :(.
I wrote a patch for this, and when I got around to to testing it, I
found that our tests currently don't pass when using both
--disable-spinlocks and --disable-atomics. Turns out to not be related
to the issue above, but the global barrier support added in 13.
That *reads* two 64 bit atomics in a signal handler. Which is normally
fine, but not at all cool when atomics (or just 64 bit atomics) are
backed by spinlocks. Because we can "self interrupt" while already
holding the spinlock.
It looks to me that that's a danger whenever 64bit atomics are backed by
spinlocks, not just when both --disable-spinlocks and --disable-atomics
are used. But I suspect that it's really hard to hit the tiny window of
danger when those options aren't used. While we have buildfarm animals
testing each of those separately, we don't have one that tests both
together...
I'm not really sure what to do about that issue. The easisest thing
would probably be to change the barrier generation to 32bit (which
doesn't have to use locks for reads in any situation). I tested doing
that, and it fixes the hangs for me.
Randomly noticed while looking at the code:
uint64 flagbit = UINT64CONST(1) << (uint64) type;
that shouldn't be 64bit, right?
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Stephen Frost | 2020-06-06 00:36:32 | Re: WIP: WAL prefetch (another approach) |
Previous Message | Tomas Vondra | 2020-06-05 23:17:26 | Re: Trouble with hashagg spill I/O pattern and costing |