From: | Sergey Koposov <koposov(at)ast(dot)cam(dot)ac(dot)uk> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Merlin Moncure <mmoncure(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile |
Date: | 2012-05-24 19:47:10 |
Message-ID: | alpine.LRH.2.02.1205242008390.14366@calx046.ast.cam.ac.uk |
Lists: | pgsql-hackers |
On Thu, 24 May 2012, Robert Haas wrote:
> As you can see, raw performance isn't much worse with the larger data
> sets, but scalability at high connection counts is severely degraded
> once the working set no longer fits in shared_buffers.
Actually, the problem persists even when I trim the dataset so that it fits
within shared_buffers.
Here is the dump (0.5 GB in size, tested with shared_buffers=10GB,
work_mem=500MB):
http://www.ast.cam.ac.uk/~koposov/files/dump.gz
I have also attached the script.
For my toy dataset, the run time of a single thread goes up from ~6.4 to
18 seconds (~3 times worse) when the threads run in parallel.
While running the script repeatedly on my main machine, I also saw, for
some reason, some variation in how much slower the threaded execution is
than a single thread.
Now I see 25 seconds for the multi-threaded run vs. the same ~6 seconds
for a single thread.
The oprofile shows
782355 21.5269 s_lock
782355 100.000 s_lock [self]
-------------------------------------------------------------------------------
709801 19.5305 PinBuffer
709801 100.000 PinBuffer [self]
-------------------------------------------------------------------------------
326457 8.9826 LWLockAcquire
326457 100.000 LWLockAcquire [self]
-------------------------------------------------------------------------------
309437 8.5143 UnpinBuffer
309437 100.000 UnpinBuffer [self]
-------------------------------------------------------------------------------
252972 6.9606 ReadBuffer_common
252972 100.000 ReadBuffer_common [self]
-------------------------------------------------------------------------------
201558 5.5460 LockBuffer
201558 100.000 LockBuffer [self]
------------------------------------------------------------
Interestingly, on another machine with much smaller shared memory (3G),
less RAM (12G), fewer CPUs, and PG 9.1 running, I was consistently getting
~7.2 vs. 4.5 sec (multi vs. single thread).
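To put the numbers above side by side, here is a minimal sketch that computes the slowdown factor (multi-threaded time divided by single-thread time) for each case. The timings are the approximate values quoted in this message; the labels and the `slowdown` helper are only illustrative names, not part of the attached script.

```python
# Approximate timings in seconds as quoted in this message:
# (single-thread time, multi-threaded time). Labels are illustrative only.
timings = {
    "main machine, first run": (6.4, 18.0),
    "main machine, later run": (6.0, 25.0),
    "second machine (PG 9.1)": (4.5, 7.2),
}


def slowdown(single, multi):
    """Return how many times slower the multi-threaded run is."""
    return multi / single


ratios = {label: slowdown(s, m) for label, (s, m) in timings.items()}

for label, ratio in ratios.items():
    print(f"{label}: {ratio:.1f}x slower")
```

This gives roughly 2.8x and 4.2x on the main machine and 1.6x on the second one.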
PS Just in case: the CPU on the main machine I'm testing on is a
Xeon(R) CPU E7-4807 (the total number of real cores is 24).
*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/
Attachment | Content-Type | Size |
---|---|---|
script.sh | application/x-sh | 306 bytes |
script.sql | text/plain | 258 bytes |