Re: Perfomance Tuning

From: Reece Hart <rkh(at)gene(dot)COM>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Perfomance Tuning
Date: 2003-08-11 22:56:07
Message-ID: 1060642567.15483.220.camel@tallac
Lists: pgsql-performance

On Mon, 2003-08-11 at 15:16, Bruce Momjian wrote:

> That _would_ work if ext2 was a reliable file system --- it is not.

Bruce-

I'd like to know your evidence for this. I'm not disputing it, but I've
used Linux for more than 7 years (including several clusters, all of
which have run ext2 or ext3) and keep a fairly close ear to kernel newsgroups,
announcements, and changelogs. I am aware that there have very
occasionally been corruption problems, but my understanding is that
these are fixed (and quickly). In any case, I'd say that your assertion
is not widely known and I'd appreciate some data or references.

As for PostgreSQL on ext2 and ext3, I recently switched from ext3 to
ext2 (Stephen Tweedie had the foresight to make this backward
compatibility possible). I did this because I had a 45M-row update on
one table that was taking an inordinate amount of time (killed after 10
hours), even though creating the database from backup takes ~4 hours
including indexing (see pgsql-performance post on 2003/07/22). CPU usage
was ~2% on an otherwise unloaded, fast, SCSI160 machine. The io columns
of vmstat suggested that PostgreSQL was writing something on the order
of 100x as many blocks as it was reading. My
untested interpretation was that both the update bookkeeping and the
data updates were being journalled: the journal space would fill, get
sync'd, and the cycle would repeat. In effect, all blocks were being
written TWICE just for the journalling, never mind the overhead for
PostgreSQL transactions. This suggests that journals probably work best
with short bursts of writes and syncing during lulls rather than
sustained writes.
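
For anyone who wants to check the write-to-read ratio on their own
machine, here is a minimal sketch in Python that sums the bi (blocks in)
and bo (blocks out) columns of captured vmstat output. The function name
io_ratio and the sample figures are made up for illustration; they are
not measurements from the machine described above.

```python
def io_ratio(vmstat_text):
    """Sum the bi/bo columns of vmstat output.

    Returns (blocks_in, blocks_out, blocks_out / blocks_in).
    Note: the first sample vmstat prints is an average since boot,
    so callers may want to drop it before passing the text in.
    """
    bi_col = bo_col = None
    total_in = total_out = 0
    for line in vmstat_text.splitlines():
        fields = line.split()
        # The column-name header tells us where bi and bo live.
        if "bi" in fields and "bo" in fields:
            bi_col, bo_col = fields.index("bi"), fields.index("bo")
            continue
        # Skip anything before the header and any too-short banner lines.
        if bi_col is None or len(fields) <= max(bi_col, bo_col):
            continue
        try:
            blocks_in = int(fields[bi_col])
            blocks_out = int(fields[bo_col])
        except ValueError:
            continue  # not a data row
        total_in += blocks_in
        total_out += blocks_out
    ratio = total_out / total_in if total_in else float("inf")
    return total_in, total_out, ratio


# Hypothetical capture, e.g. from `vmstat 5 > vmstat.log`.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  1      0  10000   2000  50000    0    0    10  1000  300  400  1  1 50 48
 0  1      0  10000   2000  50000    0    0    12  1200  300  400  1  1 50 48
"""

if __name__ == "__main__":
    blocks_in, blocks_out, ratio = io_ratio(SAMPLE)
    print(f"read {blocks_in} blocks, wrote {blocks_out} blocks, ratio {ratio:.0f}x")
```

A ratio far above what the workload itself should generate would be
consistent with, though not proof of, the double-write interpretation
above.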

I ended up solving the update issue without really updating, so ext2
timings aren't known. So, you may want to test this yourself if you're
concerned.

-Reece

--
Reece Hart, Ph.D. rkh(at)gene(dot)com, http://www.gene.com/
Genentech, Inc. 650/225-6133 (voice), -5389 (fax)
Bioinformatics and Protein Engineering
1 DNA Way, MS-93 http://www.in-machina.com/~reece/
South San Francisco, CA 94080-4990 reece(at)in-machina(dot)com, GPG: 0x25EC91A0
