Re: Postgres benchmarking with pgbench

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: "ml\(at)bortal(dot)de" <ml(at)bortal(dot)de>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgres benchmarking with pgbench
Date: 2009-03-17 00:17:18
Message-ID: 87ljr50zqp.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Greg Smith <gsmith(at)gregsmith(dot)com> writes:

> On Mon, 16 Mar 2009, Gregory Stark wrote:
>
>> Why would checkpoints force out any data? It would dirty those pages and then
>> sync the files marking them clean, but they should still live on in the
>> filesystem cache.
>
> The bulk of the buffer churn in pgbench is from the statement that updates a
> row in the accounts table. That constantly generates updated data block and
> index block pages. If you can keep those changes in RAM for a while before
> forcing them to disk, you can get a lot of benefit from write coalescing that
> goes away if constant checkpoints push things out with a fsync behind them.
>
> Not taking advantage of that effectively reduces the size of the OS cache,
> because you end up with a lot of space holding pending writes that wouldn't
> need to happen at all yet were the checkpoints spaced out better.

Ok, so it's purely a question of write i/o, not reduced cache effectiveness.
I think I could see that. I would be curious to see these results with a
larger checkpoint_segments setting.

Looking further at the graphs I think they're broken but not in the way I had
guessed. It looks like they're *overstating* the point at which the drop
occurs. Looking at the numbers it's clear that under 1GB performs well but at
1.5GBP it's already dropping to the disk-resident speed.

I think pgbench is just not that great a model for real-world usage . a) most
real world workloads are limited by read traffic, not write traffic, and
certainly not random update write traffic; and b) most real-world work loads
follow a less uniform distribution so keeping busy records and index regions
in memory is more effective.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Gregory Stark 2009-03-17 00:30:20 Re: High CPU Utilization
Previous Message Scott Marlowe 2009-03-16 22:09:13 Re: High CPU Utilization