| From: | Greg Smith <gsmith(at)gregsmith(dot)com> | 
|---|---|
| To: | Mark Wong <markwkm(at)gmail(dot)com> | 
| Cc: | "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, Selena Deckelmann <selenamarie(at)gmail(dot)com>, Gabrielle Roth <gorthx(at)gmail(dot)com> | 
| Subject: | Re: dbt-2 tuning results with postgresql-8.3.5 | 
| Date: | 2009-01-23 03:44:51 | 
| Message-ID: | Pine.GSO.4.64.0901222207330.12661@westnet.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-performance | 
On Thu, 22 Jan 2009, Mark Wong wrote:
> I'm also capturing the PostgreSQL parameters as suggested so we can
> see what's set in the config file, default, command line etc.  It's
> the "Settings" link in the "System Summary" section on the report web
> page.
Those look good, much easier to pick out the stuff that's been customized. 
I note that the Linux "Settings" links seems to be broken though.
To recap a summary here, what you had before were:
shared_buffers=24MB checkpoint_segments=100 notpm=7527
shared_buffers=8192MB checkpoint_segments=3 notpm=7996
And the new spots show:
shared_buffers=7680MB checkpoint_segments=100 notpm=9178
What's neat about your graphs now is that I think you can see the 
checkpoints happening in the response time graphs.  For example, if you 
look at 
http://207.173.203.223/~markwkm/community6/dbt2/pgtune.1000.100.1/report/rt_d.png 
and you focus on what happens just before each 10 minute mark, I'm 
guessing that response time spike is the fsync phase at the end of the 
checkpoint.  That's followed by a period where response time is really 
fast.  That's because those writes are all pooling into the now cleared 
out Linux buffer cache, but pdflush isn't really being aggressive about 
writing them out yet.  On your server that can absorb quite a few writes 
before clients start blocking on them, which is when response time climbs 
back up.
A particularly interesting bit is to compare against the result with the 
peak notpm you had in your earlier tests, where shared_buffers=15360MB: 
http://207.173.203.223/~markwkm/community6/dbt2/shared_buffers/shared_buffers.15360MB/report/rt_d.png
While the average speed was faster on that one, the worst-case response 
time was much worse.  You can really see this by comparing the response 
time distribution.
Big shared buffers but low checkpoint_segments:
http://207.173.203.223/~markwkm/community6/dbt2/shared_buffers/shared_buffers.15360MB/report/dist_d.png
Medium shared buffers and medium checkpoint_segments:
http://207.173.203.223/~markwkm/community6/dbt2/pgtune.1000.100.1/report/dist_d.png
The checkpoint spreading logic is making a lot more transactions suffer 
moderate write delays in order to get a big improvement in worst-case 
behavior.
The next fine-tuning bit I'd normally apply in this situation is to see if 
increasing checkpoint_completion_target from the default (0.5) to 0.9 does 
anything to flatten out that response time graph.  I've seen a modest 
increase in wal_buffers (from the default to, say, 1MB) help smooth out 
the rough spots too.
--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Greg Smith | 2009-01-23 03:52:47 | Re: postgresql 8.3 tps rate | 
| Previous Message | Mark Wong | 2009-01-23 01:44:47 | Re: dbt-2 tuning results with postgresql-8.3.5 |