Re: CLOG contention

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: CLOG contention
Date: 2011-12-21 19:43:13
Message-ID: CA+TgmoaH5Zn2k0=Ug33+k4Yxv5a-3ATjzwHtjhtV2hiSN0XyXg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 21, 2011 at 12:48 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On the other hand, if we just want to avoid having more requests
> simultaneously in flight than we have buffers, so that backends don't
> need to wait for an available buffer before beginning their I/O, then
> something on the order of the number of CPUs in the machine is likely
> sufficient.  I'll do a little more testing and see if I can figure out
> where the tipping point is on this 32-core box.

I recompiled with NUM_CLOG_BUFFERS = 8, 16, 24, 32, 40, 48 and ran
5-minute tests, using unlogged tables to avoid getting killed by
WALInsertLock contention. With 32 clients on this 32-core box, the
tipping point is somewhere in the neighborhood of 32 buffers. 40
buffers might still be winning over 32, or maybe not, but 48 is
definitely losing. Below 32, adding buffers helps all the way up.
Here are the full results (invocation details are sketched below,
after the parameter list):

resultswu.clog16.32.100.300:tps = 19549.454462 (including connections establishing)
resultswu.clog16.32.100.300:tps = 19883.583245 (including connections establishing)
resultswu.clog16.32.100.300:tps = 19984.857186 (including connections establishing)
resultswu.clog24.32.100.300:tps = 20124.147651 (including connections establishing)
resultswu.clog24.32.100.300:tps = 20108.504407 (including connections establishing)
resultswu.clog24.32.100.300:tps = 20303.964120 (including connections establishing)
resultswu.clog32.32.100.300:tps = 20573.873097 (including connections establishing)
resultswu.clog32.32.100.300:tps = 20444.289259 (including connections establishing)
resultswu.clog32.32.100.300:tps = 20234.209965 (including connections establishing)
resultswu.clog40.32.100.300:tps = 21762.222195 (including connections establishing)
resultswu.clog40.32.100.300:tps = 20621.749677 (including connections establishing)
resultswu.clog40.32.100.300:tps = 20290.990673 (including connections establishing)
resultswu.clog48.32.100.300:tps = 19253.424997 (including connections establishing)
resultswu.clog48.32.100.300:tps = 19542.095191 (including connections establishing)
resultswu.clog48.32.100.300:tps = 19284.962036 (including connections establishing)
resultswu.master.32.100.300:tps = 18694.886622 (including connections establishing)
resultswu.master.32.100.300:tps = 18417.647703 (including connections establishing)
resultswu.master.32.100.300:tps = 18331.718955 (including connections establishing)

Parameters in use: shared_buffers = 8GB, maintenance_work_mem = 1GB,
synchronous_commit = off, checkpoint_segments = 300,
checkpoint_timeout = 15min, checkpoint_completion_target = 0.9,
wal_writer_delay = 20ms
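
For anyone wanting to reproduce this: these are straight pgbench
runs. A sketch of the invocation, reading the result file names as
buffers.clients.scale.duration (so scale factor 100 and 300-second
runs; the -j thread count and the database name are my assumptions,
not recorded above):

    pgbench -i -s 100 --unlogged-tables bench
    pgbench -c 32 -j 32 -T 300 bench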

It isn't clear to me whether we can extrapolate anything more general
from this. It'd be awfully interesting to repeat this experiment on,
say, an 8-core server, but I don't have one of those I can use at the
moment.
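
One direction, if we do want something more general, would be to stop
hard-coding NUM_CLOG_BUFFERS and pick the buffer count at startup
based on shared_buffers, clamped to a sane range. A hypothetical
sketch (the divisor and the clamp values are placeholders, not
something these results establish):

    /*
     * Hypothetical replacement for the NUM_CLOG_BUFFERS constant:
     * scale with shared_buffers and clamp the result.  NBuffers is
     * shared_buffers in 8kB pages; Min/Max are the macros from c.h.
     * The constants 4, 32, and 512 are illustrative only.
     */
    static int
    CLOGShmemBuffers(void)
    {
        return Min(32, Max(4, NBuffers / 512));
    }

Whether the cap should track core count rather than shared_buffers is
exactly the sort of thing the 8-core experiment would help settle.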

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
