Re: Checkpoint cost, looks like it is WAL/CRC

From: "Magnus Hagander" <mha(at)sollentuna(dot)net>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <josh(at)agliodbs(dot)com>
Cc: "Postgres Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Checkpoint cost, looks like it is WAL/CRC
Date: 2005-06-30 10:50:08
Message-ID: 6BCB9D8A16AC4241919521715F4D8BCE6C773C@algol.sollentuna.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> 2. Think of a better defense against partial-page writes.
>
> I like #2, or would if I could think of a better defense.
> Ideas anyone?

FWIW, MSSQL deals with this using "Torn Page Detection". This is off by
default (no check at all!), but can be abled on a per-database level.
Note that it only *detects* torn pages. If it finds one, it won't start
and tell you to recover from backup. It can't automatically recover. I
would assume this greatly decreases the amount of data you have to
save...

From the BOL:
"
This option allows SQL Server to detect incomplete I/O operations caused
by power failures or other system outages. When true, it causes a bit to
be flipped for each 512-byte sector in an 8-kilobyte (KB) database page
whenever the page is written to disk. If a bit is in the wrong state
when the page is later read by SQL Server, then the page was written
incorrectly; a torn page is detected. Torn pages are usually detected
during recovery because any page that was written incorrectly is likely
to be read by recovery.

Although SQL Server database pages are 8 KB, disks perform I/O
operations using a 512-byte sector. Therefore, 16 sectors are written
per database page. A torn page can occur if the system fails (for
example, due to power failure) between the time the operating system
writes the first 512-byte sector to disk and the completion of the 8 KB
I/O operation. If the first sector of a database page is successfully
written before the failure, the database page on disk will appear as
updated, although it may not have succeeded.

Using battery-backed disk controller caches can ensure that data is
successfully written to disk or not written at all. In this case, do not
set torn page detection to true, for it is not needed.
"

and some FAQs:
"
Q. Does enabling the "torn page" database option add any measurable
performance overhead to a server?

A. The torn page option does not add much CPU cost at all, but it can
increase contention on "hot" pages. With torn page detection off, a page
can be accessed while it is being written to disk. This is not true if
torn page detection is on.

Q. When does SQL Server check for torn pages? At startup? Any time it
reads a page from disk?

A. Torn page detection is done whenever a page is read from disk. In
practice, this is likely to be during recovery, because any page on
which the write did not complete during normal operations is very likely
to be read by recovery (except for non-logged operations, such as index
creation, bcp, and so on).

Q. What happens when SQL Server detects a torn page?

A. When a torn page is detected, a severe I/O error is raised. This
error will close the connection. The database is only marked suspect if
the torn page is detected during recovery.

Q. How can I recover from torn pages?

A. Restoring the database from a backup and rolling the transaction log
forward should correct the problem with no data loss.

Q. What situations are most likely to cause torn pages?

A. Lab tests have shown that torn pages are quite likely to happen when
disk writes are occurring and power is lost. If you do not have a
battery backup or uninterruptible power supply (UPS), you should
consider enabling this option.
"

Perhaps a GUC that would let you chose between "full recovery, detection
or none at all"? Though that might complicate the code...

//Magnus

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Oleg Bartunov 2005-06-30 10:59:21 Re: contrib/rtree_gist into core system?
Previous Message Fabien COELHO 2005-06-30 09:48:17 Re: [PATCHES] Users/Groups -> Roles