From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Dave Page <dpage(at)pgadmin(dot)org> |
Subject: | Re: problems on Solaris |
Date: | 2015-06-24 12:42:10 |
Message-ID: | 20150624124210.GN4797@alap3.anarazel.de |
Lists: | pgsql-hackers |
On 2015-05-31 01:09:18 +0200, Andres Freund wrote:
> On 2015-05-27 21:23:34 -0400, Robert Haas wrote:
> > > Oh wow, that's bad, and could explain a couple of the problems we're
> > > seeing. One possible way to fix is to replace the sequence with if
> > > (!TAS(spin)) S_UNLOCK();. But that'd mean TAS() has to be a barrier,
> > > even if the lock isn't free - which e.g. isn't the case for PowerPC's
> > > implementation :(
> >
> > Another possibility is to make the fallback barrier implementation a
> > system call, like maybe kill(PostmasterPid, 0).
>
> It's not necessarily true that all system calls are effective
> barriers. I'm e.g. doubtful that kill(..., 0) is one as it only performs
> local error checking. It might be that the process existence check
> includes a lock that's sufficient, but I would not like to rely on
> it. Sending an actual signal probably would be, but has the potential of
> disrupting postmaster progress.

I thought about various other syscalls we could use, and your proposal
seems to be the least bad option. My idea of using waitpid() falls short
because it only works for child processes. I think the kind of systems
that we don't have barriers on are unlikely to use complex stuff like RCU
to manage access to process hierarchies, so the pid lookup in kill()
presumably still takes a lock that acts as a barrier.

I reproduced the 'stuck' issue on x86 by #ifdef'ing out barrier support:
about 50% of the time test_shm_mq gets stuck. Replacing the fallback
barrier with kill(PostmasterPid, 0) "works". Unless somebody protests
soon, that's what I'm going to commit. It surely is better than easily
reproducible hangs.
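
For readers following along, here is a minimal sketch of what a syscall-based fallback barrier could look like. It is not the committed code: barrier_target_pid and fallback_memory_barrier() are hypothetical names standing in for PostmasterPid and the real barrier function, and the assumption that the kernel's pid existence check implies a memory barrier is exactly the one discussed above, not something POSIX guarantees.

```c
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for PostmasterPid; set once at startup. */
pid_t barrier_target_pid;

/*
 * Fallback "barrier": with signal 0 nothing is delivered, the kernel only
 * checks that the target process exists and that we may signal it.
 * Assumption (not promised by POSIX): that check goes through a lock or
 * similar, which acts as a full memory barrier on old systems.
 * Returns kill()'s result so a caller can see that the call succeeded.
 */
int
fallback_memory_barrier(void)
{
    return kill(barrier_target_pid, 0);
}
```

A caller would set barrier_target_pid once (to getpid() for a quick experiment, to the postmaster's pid in a backend) and then call fallback_memory_barrier() wherever the fallback barrier is needed. kill() is async-signal-safe, so a function like this can also be called from a signal handler.
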

I'm wondering whether we should add a #warning to atomics.c if either the
fallback memory or compiler barrier is used? Might be annoying to people
using -Werror, but I doubt building with -Werror is possible anyway on
such old systems.
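
To make the #warning idea concrete, a rough sketch follows. All DEMO_* names are hypothetical and do not match the real feature macros. Note that #warning is a GCC/Clang extension before C23, and that GCC turns it into an error under -Werror.

```c
/*
 * All DEMO_* names are hypothetical; the real feature macros differ.
 * Pretend the platform headers provided a native memory barrier.
 * Remove this define to see the #warning fire at compile time.
 */
#define DEMO_HAVE_MEMORY_BARRIER 1

#if !defined(DEMO_HAVE_MEMORY_BARRIER)
#warning "no native memory barrier available, using slow fallback"
#define DEMO_USING_FALLBACK_BARRIER 1
#else
#define DEMO_USING_FALLBACK_BARRIER 0
#endif

/* Lets code or tests ask at run time which path was compiled in. */
int
demo_using_fallback_barrier(void)
{
    return DEMO_USING_FALLBACK_BARRIER;
}
```

With the define in place the file compiles silently and demo_using_fallback_barrier() returns 0. Without it, the build emits the warning once per compilation of that file, which is the "annoying with -Werror" case.
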

Greetings,
Andres Freund