From: | Sergey Koposov <koposov(at)ast(dot)cam(dot)ac(dot)uk> |
---|---|
To: | Florian Pflug <fgp(at)phlo(dot)org> |
Cc: | Merlin Moncure <mmoncure(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net> |
Subject: | Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile |
Date: | 2012-05-30 23:16:27 |
Message-ID: | alpine.LRH.2.02.1205302252020.6351@calx046.ast.cam.ac.uk |
Lists: | pgsql-hackers |
On Wed, 30 May 2012, Florian Pflug wrote:
>
> I wonder if the huge variance could be caused by non-uniform
> synchronization costs across different cores. That's not all that
> unlikely, because at least some cache levels (L2 and/or L3, I think) are
> usually shared between all cores on a single die. Thus, a cache line
> bouncing between cores on the same die might very well be faster than it
> bouncing between cores on different dies.
>
> On linux, you can use the taskset command to explicitly assign processes
> to cores. The easiest way to check if that makes a difference is to
> assign one core for each connection to the postmaster before launching
> your test. Assuming that CPU assignments are inherited by child
> processes, that should then spread your backends out over exactly the
> cores you specify.
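The inheritance assumption in the quoted text can be checked directly. The following is a minimal sketch, assuming Linux and Python 3.7+; the helper name `child_affinity_after_pinning` is hypothetical and not part of PostgreSQL or taskset. It pins the current process to one CPU, starts a child process, and reports the CPU set the child sees.

```python
import os
import subprocess
import sys


def child_affinity_after_pinning(cpu):
    """Pin this process to `cpu`, spawn a child, and return the child's CPU list.

    Hypothetical helper for illustration; relies on the Linux-only
    os.sched_getaffinity / os.sched_setaffinity calls.
    """
    original = os.sched_getaffinity(0)  # remember the current mask
    try:
        os.sched_setaffinity(0, {cpu})  # 0 means "the calling process"
        # The child prints its own affinity mask as space-separated integers.
        out = subprocess.run(
            [sys.executable, "-c",
             "import os; print(*sorted(os.sched_getaffinity(0)))"],
            capture_output=True, text=True, check=True,
        ).stdout
    finally:
        os.sched_setaffinity(0, original)  # restore the original mask
    return [int(tok) for tok in out.split()]


if __name__ == "__main__":
    cpu = min(os.sched_getaffinity(0))  # pick a CPU we are allowed to use
    print("child sees CPUs:", child_affinity_after_pinning(cpu))
```

If the child reports only the pinned CPU, the mask was inherited, which is the behaviour the taskset suggestion relies on for backends started by a pinned parent.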
Wow, thanks! This seems to be working to some extent. I've found that
pinning each thread x (0<x<7) to CPU 1+3*x (a reminder that I have HT
disabled and have 4 CPUs in total, with 6 real cores each) gives quite
good results. After a few runs I seem to be getting more or less stable
results for multiple threads, with the multithreaded run times ranging
from 6 to 11 seconds across the threads. (Another reminder: 5-6 seconds
is roughly the time one of my queries takes in a single thread.)
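For reference, the thread-to-CPU mapping described above can be written down as a short sketch. It assumes Linux and reads the range 0<x<7 literally as x = 1..6; the function names `cpu_for_worker` and `pin_backend` are hypothetical, and the pid passed to `pin_backend` would be the process ID of an already running backend.

```python
import os


def cpu_for_worker(x):
    """Return the CPU index for worker x using the 1+3*x rule from the text."""
    return 1 + 3 * x


def pin_backend(pid, cpu):
    """Restrict process `pid` to a single CPU (Linux only).

    Comparable in effect to running `taskset -cp <cpu> <pid>` by hand.
    """
    os.sched_setaffinity(pid, {cpu})


if __name__ == "__main__":
    # Only print the mapping here; pinning needs real backend pids.
    for x in range(1, 7):
        print(f"worker {x} -> cpu {cpu_for_worker(x)}")
```

Calling `pin_backend(pid, cpu_for_worker(x))` for each backend would apply the mapping; which physical die each CPU index belongs to depends on how the kernel numbers the cores on the machine in question.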
So to some extent one can say that the problem is partially solved (i.e.
it is probably understood).
But the question now is whether there is a *PG* problem here or
not, or is it Intel's or Linux's problem?
Because the slowdown was still caused by locking. If there were no
locking, there wouldn't be any problems (as demonstrated a while ago by
just cat'ing the files in multiple threads).
Cheers,
S
*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/