From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: mosbench revisited
Date: 2011-08-04 01:16:14
Message-ID: CA+TgmoYeS+RgQvnQEYNpA7JCjjd_0SkjSwF29Lrsy+vkGxcvrQ@mail.gmail.com
Lists: pgsql-hackers
On Wed, Aug 3, 2011 at 5:35 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> That still seems utterly astonishing to me. We're touching each of
> those files once per query cycle; a cycle that contains two message
> sends, who knows how many internal spinlock/lwlock/heavyweightlock
> acquisitions inside Postgres (some of which *do* contend with each
> other), and a not insignificant amount of plain old computing.
> Meanwhile, this particular spinlock inside the kernel is protecting
> what, a single doubleword fetch? How is that the bottleneck?
Spinlocks seem to have a very ugly "tipping point". When I tested
pgbench -S on a 64-core system with the lazy vxid patch applied and a
patch to use random_r() in lieu of random(), the amount of system time
used per SELECT-only transaction at 48 clients was 3.59 times what it
was at 4 clients. The amount used per transaction at 52 clients was
3.63 times the amount at 48 clients, and the amount at 56 clients was
3.25 times the amount at 52 clients; compounding those ratios, a
transaction at 56 clients burns roughly 42 times the system time of
one at 4 clients. You can see the throughput graph starting to
flatten out in the 32-44 client range, but it's not particularly
alarming. Once you pass that point, though, things get out of control
in a real hurry: a few more clients and the machine is basically doing
nothing but spin.
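
To make the failure mode concrete, here's a minimal test-and-set
spinlock sketch in C11. This is not our actual s_lock (that's
per-platform assembly, e.g. xchg on x86), just an illustration of why
contended spinning goes nonlinear: every waiter executes an atomic
read-modify-write against the same cache line, so the line ping-pongs
between cores and each acquisition gets more expensive as the spinner
count grows.

#include <stdatomic.h>

/* Illustrative only; initialize 'locked' with ATOMIC_FLAG_INIT. */
typedef struct
{
    atomic_flag locked;
} spinlock;

static void
spin_lock(spinlock *lock)
{
    /*
     * Each failed test-and-set is still a full atomic exchange, so N
     * spinners generate N streams of cache-line invalidations even
     * while none of them makes progress.
     */
    while (atomic_flag_test_and_set_explicit(&lock->locked,
                                             memory_order_acquire))
        ;                       /* spin; nobody ever deschedules */
}

static void
spin_unlock(spinlock *lock)
{
    atomic_flag_clear_explicit(&lock->locked, memory_order_release);
}

Below the tipping point, the lock is usually free and the
test-and-set succeeds on the first try; past it, nearly every attempt
fails, and the box spends its cycles shipping one cache line around
instead of running queries.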
> I am wondering whether kernel spinlocks are broken.
I don't think so. Stefan Kaltenbrunner had one profile showing
something like sixty or eighty percent of the user-mode CPU time in
s_lock. I didn't have access to that particular hardware, but the
testing I've done strongly suggests that most of that was the
SInvalReadLock spinlock. And before I patched pgbench to avoid
calling random(), pgbench itself was doing the same thing: literally
flattening a 64-core box with every client fighting over a single
futex that normally costs almost nothing. (That case wasn't quite as
bad, because the futex actually deschedules the waiters, but it was
still bad.) I'm not really sure why it shakes out this way (birthday
paradox?), but having seen the effect several times now, I'm
disinclined to believe it's an artifact.
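
For reference, the random() fix amounts to giving each client its own
generator state, along these lines; this is a glibc-specific sketch
with made-up names, and the actual pgbench patch may well differ:

#define _GNU_SOURCE
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Per-thread PRNG state: random_r() touches only these, so there is
 * no shared lock to fight over, unlike random(), which serializes
 * every caller on one hidden futex inside glibc.
 */
static __thread struct random_data rand_state;
static __thread char rand_statebuf[64];

static void
thread_srandom(unsigned int seed)
{
    /* glibc requires random_data to be zeroed before initstate_r(). */
    memset(&rand_state, 0, sizeof(rand_state));
    initstate_r(seed, rand_statebuf, sizeof(rand_statebuf), &rand_state);
}

static long
thread_random(void)
{
    int32_t result;

    random_r(&rand_state, &result);
    return result;
}

With per-thread state, the shared lock never enters the picture at
all, which is why the futex contention simply vanished once pgbench
stopped calling random().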
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company