From: | Erik Jones <erik(at)myemma(dot)com> |
---|---|
To: | "dimitri k" <dimitrik(dot)fr(at)gmail(dot)com> |
Cc: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "PostgreSQL Performance" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Shared buffers, db transactions commited, and write IO on Solaris |
Date: | 2007-03-29 18:58:13 |
Message-ID: | 606E8491-AAFB-493E-8FFA-FF26D6DD5ACE@myemma.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Mar 29, 2007, at 12:41 PM, dimitri k wrote:
> On 3/29/07, Erik Jones <erik(at)myemma(dot)com> wrote:
>> On Mar 29, 2007, at 11:16 AM, Tom Lane wrote:
>>
>> > Erik Jones <erik(at)myemma(dot)com> writes:
>> >> We've recently made a couple changes to our system that have
>> resulted
>> >> in a drastic increase in performance as well as some very
>> confusing
>> >> changes to the database statistics, specifically
>> >> pg_stat_database.xact_commit. Here's the details:
>> >
>> > I'm kinda boggled too. I can see how increasing shared buffers
>> could
>> > result in a drastic reduction in write rate, if the working set of
>> > your
>> > queries fits in the new space but didn't fit in the old. I have no
>> > idea
>> > how that leads to a drop in number of transactions committed
>> though.
>> > It doesn't make sense that autovac would run less frequently,
>> because
>> > it's driven by number of tuples changed not number of disk
>> writes; and
>> > that could hardly account for a 10x drop anyway.
>> >
>> > Did you by any chance take note of exactly which processes were
>> > generating all the I/O or the CPU load?
>>
>> Well, wrt to the CPU load, as I said, we're pretty sure that's
>> autovac as we still get spikes that hit about the same threshold,
>> after which cache hits go up dramatically and the spikes just don't
>> last two days anymore.
>>
>> As far as the procs responsible for the writes go, we were unable to
>> see that from the OS level as the guy we had as a systems admin last
>> year totally screwed us with the way he set up the SunCluster on the
>> boxes and we have been unable to run Dtrace which has left us
>> watching a lot of iostat. However, we did notice a direct
>> correlation between write spikes and "write intensive" queries like
>> large COPYs, UPDATEs, and INSERTs.
>>
>> One very important thing to note here is that the number, or rather
>> rate, of disk writes has not changed. It's the volume of data in
>> those writes that has dropped, along with those transaction
>> mysterious counts. Could the bgwriter be the culprit here? Does
>> anything it does get logged as a transaction?
>>
>> erik jones <erik(at)myemma(dot)com>
>> software developer
>> 615-296-0838
>> emma(r)
>>
>>
>>
>>
> Erik,
>
> using 'forcedirectio' simply brings your write operations to the
> *real* volume - means while you need to write 10 bytes you'll write 10
> bytes (instead of UFS block size (8K)). So it explains me why your
> write volume became slower.
Sorry, that's not true. Google "ufs forcedirectio" go to the first
link and you will find:
"forcedirectio
The forcedirectio (read "force direct IO") UFS option causes data to
be buffered in kernel address whenever data is transferred between
user address space and the disk. In other words, it bypasses the file
system cache. For certain types of applications -- primarily database
systems -- this option can dramatically improve performance. In fact,
some database experts have argued that a file using the forcedirectio
option will outperform a raw partition, though this opinion seems
fairly controversial.
The forcedirectio improves file system performance by eliminating
double buffering, providing a small, efficient code path for file
system reads and writes and removing pressure on memory."
However, what this does mean is that writes will be at the actual
filesystem block size and not the cache block size (8K v. 512K).
>
> Now, why TX number is reduced - is a small mystery :)
>
> Options:
> - you really do 10 times less commits, means you work 10 times
> slower? ;)
> what about users? how do you measure your work performance?
We are an email marketing service provider with a web front end
application. We measure work performance via web requests (counts,
types, etc...), mailer activity and the resulting database activity.
We are doing as much or more work now than previously, and faster.
>
> - TX reported in pg_* tables are not exact, but I don't believe
> at all :)
Even if they aren't exact, being off by a factor of 10 wouldn't be
believable. the forcedirectio mount setting for ufs can definitely
explain the drop in data written volume, but doesn't do much to
explain the difference in xact commits.
erik jones <erik(at)myemma(dot)com>
software developer
615-296-0838
emma(r)
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2007-03-29 19:19:49 | Re: Shared buffers, db transactions commited, and write IO on Solaris |
Previous Message | Vincenzo Romano | 2007-03-29 18:56:07 | Weird performance drop |