Re: Now I am back, next thing. Final PGS tuning.

From: Bill Moran <wmoran(at)potentialtech(dot)com>
To: Jennifer Trey <jennifer(dot)trey(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Now I am back, next thing. Final PGS tuning.
Date: 2009-04-08 14:23:53
Message-ID: 20090408102353.dd42023a.wmoran@potentialtech.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

In response to Jennifer Trey <jennifer(dot)trey(at)gmail(dot)com>:
>
> I have 8GB memory, Intel Quad Core 2.4Ghz with 8MB L2 cache. I am running
> Windows Web Server 2008 x64 and will be running a Java (64 bit version)
> application.
>
> I want to give the java app room for working on 2-3GB. The operating system
> is currently consuming around 1GB but lets give it a little more room. Lets
> give it a total of 2GB.
>
> That leaves 8-3-2 = 3 GB for PostgreSQL alone. Sounds good?
>
> Here is my config file :
>
> http://85.235.31.35/resources/postgresql.conf
>
> I see there is a setting
>
> *max_connections = 100*
>
> What does this do? Should I be looking at this as max similtaneous queries ?
> is 100 really enough? I think I want to max this more.
>
> I am looking for a worst scenario around like 50-100 similitaneous user
> clicks (per second?). But the querying might be around like 200 queries per
> seocond, not really, but I want to be prepared. :)

Depends on how long your "clicks" take to process. If you're doing 100
page views (clicks) /second and each view takes 2 seconds to process, you're
tying up 200 connections on a continual basis.

Unless you're using some sort of connection pooling ... I'm no Java expert,
but doesn't Java have connection pooling built in? If so, it becomes
more difficult to estimate the # of simultaneous connections because each
instance of a running script might share a connection with other scripts.

In that case, you'll probably have to test to see what a good max is, as
it's going to be difficult or impossible to estimate.

In any event, 100 is probably a good starting point (based on my
experience). Note that if you find that you have to raise that value too
high, (much over a few hundred) then you probably want to investigate some
form of connection pooling, such as pgpool.

> Here is some other settings I am thinking about :
>
> *effective_cache_size = 449697*

What version of Postgres? In modern versions, you can specify MB, GB, etc.

This value should be the memory that's left unused when everything is
running (including Postgres). It helps the planner estimate how much of
the filesystem is cached in memory. Based on the other numbers you've
mentioned, this should probably be set to about 2G.

> *maintenance_work_mem = 16384 *
>
> *work_mem = 1024 # I think this is kb. Way to low, right? What is a better
> value?*

I haven't noticed much value in tweaking this. It only affects a few
commands, such as vacuum and analyze. Test to see if tweaking it speeds
up vacuum without pushing the system into swap.

> *shared_buffers = 1024 # min 128kB or max_connections*16kB ## Also to low.
> Right? I've got 3GB to work with!*

Assuming that's equating to 1G, then the value is about right. Common
best practice is to set this value to 1/4 - 1/3 of the memory available
for PostgreSQL. You're saying you'll have ~3G for PG, so 1G is about
right to start with.

Once the system is up and running, you can install pg_buffercache to
monitor usage and help tune it.

> *wal_buffers = 256 # Also kB...*
>
> Please give your thoughts. I was also wondering about the Vacuum, force
> reindex and stuff. Are those things good to run once in a while? Force
> sounds a little brutal though!

Turn on autovacuum. I've found it's the best way to go in 99% of installs
(the corner cases being servers that have _very_ predictable workloads ...
in which case explicit, scheduled vacuums are better).

REINDEXing is an occasional topic of discussion. Doing it occasionally
definitely saves disk space on frequently updated databases, but the
impact (if any) on performance is a subject for debate. I've yet to see
any drastic performance improvement from REINDEXing, but if you've got
obvious off-peak times (i.e., if nobody uses the system over weekends or
something) it probably doesn't hurt to reindex everything on a regular
schedule. Don't obsess over it, though.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Massa, Harald Armin 2009-04-08 14:38:24 Re: Now I am back, next thing. Final PGS tuning.
Previous Message Adrian Klaver 2009-04-08 14:12:29 Re: pg_dump/psql: Select a server and automate password