Quick Links

Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.

From:	Andres Freund <andres(at)2ndquadrant(dot)com>
To:	Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc:	Merlin Moncure <mmoncure(at)gmail(dot)com>, Дмитрий Дегтярёв <degtyaryov(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.
Date:	2013-09-27 21:12:17
Message-ID:	20130927211217.GA9819@awork2.anarazel.de
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 2013-09-27 23:28:37 +0300, Heikki Linnakangas wrote:
> On 27.09.2013 22:00, Merlin Moncure wrote:
> >Attached is simplified patch that replaces the spinlock with a read
> >barrier based on a suggestion made by Andres offlist. The approach
> >has different performance characteristics -- a barrier call is being
> >issued instead of a non-blocking read. I don't have a performance
> >test case in hand to prove that's better so I'm going with Andre's
> >approach because it's simpler.
>
> Can you think of a concrete example of what might go wrong if we simply
> removed the spinlock, and did an unprotected read through the volatile
> pointer? Is there some particular call sequence that might get optimized in
> an unfortunate way? I'm not suggesting that we do that - better safe than
> sorry even if we can't think of any particular scenario - but it would help
> me understand this.

I think in some architecutres the write from the other side might just
not immediately be visible, so RecoveryInProgress() might return a
spurious true, although we already finished.
I haven't checked whether there's any callers that would have a problem
with that.

> I don't think a read-barrier is enough here. See README.barrier:
>
> >When a memory barrier is being used to separate two
> >loads, use pg_read_barrier(); when it is separating two stores, use
> >pg_write_barrier(); when it is a separating a load and a store (in either
> >order), use pg_memory_barrier().

I think the documentation is just worded in an easy to misunderstand
way. If you look at the following example, we're fine.
The "two stores", "two loads" it is talking about are what's happening
in one thread of execution. The important part of a read barrier is that
it forces a fresh LOAD, separated from earlier LOADs to the same
address.

> The function RecoveryInProgress() function does just one load, to read the
> variable, and wouldn't even need a barrier by itself. The other load or
> store that needs to be protected by the barrier happens in the caller,
> before or after the function, and we can't say for sure if it's a load or a
> store. So, let's use pg_memory_barrier().

The caller uses a spinlock, so it's guaranteed to write out before the
spinlock is released. A write barrier (the spinlock in the startup
process) should always be paired by a read barrier.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Re: [PERFORM] Cpu usage 100% on slave. s_lock problem. at 2013-09-27 20:28:37 from Heikki Linnakangas

Responses

Re: [PERFORM] Cpu usage 100% on slave. s_lock problem. at 2013-09-27 21:53:59 from Andres Freund

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Andres Freund	2013-09-27 21:18:24	Re: logical changeset generation v6
Previous Message	Steve Singer	2013-09-27 21:06:59	Re: logical changeset generation v6