From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Scaling shared buffer eviction
Date: 2014-09-23 23:42:53
Message-ID: 20140923234253.GJ2521@awork2.anarazel.de
Lists: pgsql-hackers
On 2014-09-23 19:21:10 -0400, Robert Haas wrote:
> On Tue, Sep 23, 2014 at 6:54 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > I think it might be possible to construct some cases where the spinlock
> > performs worse than the lwlock. But I think those will be clearly in the
> > minority. And at least some of those will be fixed by bgwriter.
Err, this should read bgreclaimer, not bgwriter.
> > As an
> > example consider a single backend COPY ... FROM of large files with a
> > relatively large s_b. That's a seriously bad workload for the current
> > victim buffer selection because frequently most of shared buffers needs
> > to be searched to find a buffer with usagecount 0. And this patch will double the
> > amount of atomic ops during that.
>
> It will actually be far worse than that, because we'll acquire and
> release the spinlock for every buffer over which we advance the clock
> sweep, instead of just once for the whole thing.
I said double, because we already acquire the buffer header's spinlock
every tick.
> That's reason to hope that a smart bgreclaimer process may be a good
> improvement on top of this.
Right. That's what I was trying to say with bgreclaimer above.
> But I think it's right to view that as something we need to test vs.
> the baseline established by this patch.
Agreed. I think the possible downsides are at the very fringe of already
bad cases. That's why I agreed that you should go ahead.
> > Let me try to quantify that.
>
> Please do.
I've managed to find a ~1.5% performance regression. But the setup was
plain absurd. COPY ... FROM /tmp/... BINARY; of large bytea datums into
a fillfactor 10 table with the column set to PLAIN storage. With the
resulting table size chosen so it's considerably bigger than s_b, but
smaller than the dirty writeback limit of the kernel.
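A hypothetical reconstruction of that setup (the actual table name, file path, and sizes were not given in the mail) might look like:

```sql
-- Illustrative only: table and path names are made up.
CREATE TABLE copy_target (payload bytea) WITH (fillfactor = 10);
ALTER TABLE copy_target ALTER COLUMN payload SET STORAGE PLAIN;

-- Load large bytea rows until the table is considerably bigger than
-- shared_buffers but still below the kernel's dirty writeback limit,
-- so victim-buffer searches dominate rather than write I/O.
COPY copy_target FROM '/tmp/payload.bin' BINARY;
```

The low fillfactor and PLAIN storage keep rows uncompressed and pages sparse, inflating the table quickly relative to s_b.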
That's perfectly reasonable.
I can think of a couple other cases, but they're all similarly absurd.
> >> What's interesting is that I can't see in the perf output any real
> >> sign that the buffer mapping locks are slowing things down, but they
> >> clearly are, because when I take this patch and also boost
> >> NUM_BUFFER_PARTITIONS to 128, the performance goes up:
> >
> > What events did you profile?
>
> cs.
Ah. My guess is that most of the time will probably actually be spent in
the lwlock's spinlock, not in the lwlock putting itself to sleep.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services