From: | Dawid Kuroczko <qnex42(at)gmail(dot)com> |
---|---|
To: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, josh(at)agliodbs(dot)com, Postgres Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Checkpoint cost, looks like it is WAL/CRC |
Date: | 2005-07-08 09:41:23 |
Message-ID: | 758d5e7f0507080241493c9d1d@mail.gmail.com |
Lists: | pgsql-hackers |
On 7/7/05, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> wrote:
> One idea would be to just tie its behavior directly to fsync and remove
> the option completely (that was the original TODO), or we can adjust it
> so it doesn't have the same risks as fsync, or the same lack of failure
> reporting as fsync.
I wonder about one thing -- how much impact does the underlying filesystem have?
I mean, the problem with "partial writes" to pages is how to handle the situation
where the machine loses power and we are not sure whether the write was
completed or not.
But then again, imagine the data is on a filesystem with data journaling
(like ext3 with data=journal). There, to my understanding, the data is
first written into the journal before being written to its final location
on the drive. Assuming the drive loses power during the process, I guess
there would be two possible situations:
1) the modification was committed to the journal completely, so we can replay
the journal and we are sure the 8kB block is fine. (*)
2) the modification in the journal is not complete. It has not been fully
committed to the filesystem journal. And we can safely assume that the
drive still has the old data.
(*) I am not sure if this is true for 8kB blocks, and of course, I don't
know ext3's journalling and its atomicity guarantees well...
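To make the two cases above concrete, here is a toy model in Python. It is only a sketch of the reasoning in this mail, not of how ext3 really implements its journal: the 512-byte sector size, the function names, and the rule that a journal entry counts only once its commit record has been written are all assumptions made for illustration.

```python
PAGE_SIZE = 8192    # page size discussed in this thread
SECTOR_SIZE = 512   # assumption: the unit the drive writes atomically


def journaled_write(old_page, new_page, sectors_journaled, commit_written):
    """Page found on disk after a crash and journal replay (toy model)."""
    total_sectors = PAGE_SIZE // SECTOR_SIZE
    # Assumption: the entry is valid only if every sector reached the
    # journal AND the commit record was written afterwards.
    committed = commit_written and sectors_journaled == total_sectors
    if committed:
        return new_page   # case 1: replay the journal, page is entirely new
    return old_page       # case 2: entry is discarded, page is entirely old


def direct_write(old_page, new_page, sectors_written):
    """Page found on disk after a crash during an in-place overwrite."""
    cut = sectors_written * SECTOR_SIZE
    # Without a data journal the page can be a mix of new and old sectors.
    return new_page[:cut] + old_page[cut:]


old = b"A" * PAGE_SIZE
new = b"B" * PAGE_SIZE

case1 = journaled_write(old, new, 16, True)    # journal entry complete
case2 = journaled_write(old, new, 7, False)    # power lost mid-journal
torn = direct_write(old, new, 7)               # power lost mid-overwrite
```

In this model `case1` equals the new page and `case2` equals the old page, while `torn` is neither, which is the partial-write situation the question is about.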
Assuming the above is true, it would be interesting to see how ext3
with data=journal and partial writes compares with ext3 data=someother
without it.
I don't have extensive knowledge of journalling internals, but I thought
I would mention it, so that people with wider knowledge could add their
input here.
Regards,
Dawid