Brian Hamlin <maplabs(at)light42(dot)com> wrote:
> On Mar 16, 2012, at 7:17 AM, Kevin Grittner wrote:
>> Andy Colson <andy(at)squeakycode(dot)net> wrote:
>>
>>> I tried shared_buffers at both 2400M and 18000M, and it took 4.5
>>> hours both times. ... (weak attempts at humor omitted) ....
Ah, I didn't pick up on the attempts at humor; perhaps that's why
you mistook something I said as an attempt at an insult. We get
posts here from people at all different levels of experience, and
many people are grateful for pointers on what various utilities can
do for them or how best to formulate a post so they can get help
when they need it. Attempts to help don't constitute insults, even
if the need is feigned.
>> All else being the same, adjusting shared_buffers affects how
>> much of your cache is managed by PostgreSQL and how much of your
>> cache is managed by the OS; it doesn't exactly change how much
>> you have cached or necessarily affect disk waits.
> Linux caching is aggressive already.. so I think this example
> points out that Postgres caching is not contributing here.. thats
> why I posted this short example to this list.. I thought ti was a
> useful data point.. that it might be useful to others... and to
> the PostgreSQL project devs...
Yeah, guidelines for shared_buffers in the docs are vague because
the best setting varies so much with the workload. While the docs
hint at setting it at 25% of the computer's RAM, most benchmarks
posted on this list have found throughput to peak at around 8GB to
10GB on system where 25% would be more than that. (Far less on
Windows, as the docs mention.)
There can be a point well before that where there are latency
spikes. In our shop we have a multi-TB database backing a web site,
and to prevent these latency spikes we keep shared_buffers down to
2GB even though the system has 128GB RAM. Forcing dirty pages out
to the OS cache helps them to be written in a more timely manner by
code which knows something about the hardware and what order of
writes will be most efficient. PostgreSQL has, as a matter of a
design choice, decided to leave a lot to the OS caching, file
systems, and device drivers, and a key part of tuning is to discover
what balance of that versus the DBMS caching performs best for your
workload.
> some of the queries have been gone over fairly well, other not..
> Its a complex sequence and we are in production mode here,
> so I dont get a chance to do everything I might do with regard to
> one particular query...
You may want to take a look at auto_explain:
http://www.postgresql.org/docs/current/interactive/auto-explain.html
Since you're already in production it may be hard to test the
performance of your disk system, but it's a pretty safe bet that if
you are at all disk-bound you would benefit greatly from adding one
more drive and converting your 3 drive RAID 5 to a 4 drive RAID 10,
preferably with a RAID controller with BBU cache configured for
write-back.
-Kevin