CRCs

From: ncm(at)zembu(dot)com (Nathan Myers)
To: pgsql-hackers(at)postgresql(dot)org
Subject: CRCs
Date: 2001-01-12 20:35:14
Message-ID: 20010112123514.A7251@store.zembu.com
Lists: pgsql-hackers

Vadim wrote:
> Tom wrote:
> > Bruce wrote:
> > > ... If the CRC on
> > > the WAL log checks for errors that are not checked anywhere else,
> > > then fine, but I thought disk CRC would just duplicate the I/O
> > > subsystem/disk.
> >
> > A disk-block CRC would detect partially written blocks (ie,
> > power drops after disk has written M of the N sectors in a
> > block). The disk's own checks will NOT consider this condition a
> > failure. I'm not convinced that WAL will reliably detect it either
> > (Vadim?).
>
> The idea proposed by Andreas about a "physical log" is implemented! Now
> WAL saves whole data blocks on their first modification after a
> checkpoint. This way, on recovery, modified data blocks will first be
> restored *as a whole*. Isn't that much better than just detecting
> partial writes?

This seems to protect against some partial writes, but see below.

> > Certainly WAL will not help for corruption caused by external agents,
> > away from any updates that are actually being performed/logged.
>
> What do you mean by "external agents"?

External agents include RAM bit drops and noise on cables when
blocks are (read and re-) written. Every time data is moved,
there is a chance of an undetected error being introduced. The
disk only promises (within limits) to deliver the sector that
was written; it doesn't promise that what was written is what
you meant to write. Errors of this sort accumulate unless
caught by end-to-end checks.
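
To make "end-to-end check" concrete, here is a minimal sketch in Python
(not PostgreSQL code). It assumes a CRC-32 kept in the last four bytes
of each block; the names seal_block, open_block and flip_bit are made up
for illustration. The CRC is computed where the data originates and
checked where it is finally used, so a bit flipped anywhere in between
is caught.

```python
import struct
import zlib

CRC_SIZE = 4  # assumed layout: CRC-32 stored in the last 4 bytes of a block


def seal_block(payload: bytes) -> bytes:
    """Append a CRC-32 of the payload, computed where the data originates."""
    return payload + struct.pack("<I", zlib.crc32(payload))


def open_block(block: bytes) -> bytes:
    """Return the payload, or raise if it no longer matches its stored CRC."""
    payload, stored = block[:-CRC_SIZE], block[-CRC_SIZE:]
    if struct.pack("<I", zlib.crc32(payload)) != stored:
        raise ValueError("block CRC mismatch")
    return payload


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Simulate a single bit error introduced somewhere in transit."""
    damaged = bytearray(data)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)
```

A block that survives the trip opens normally; open_block(flip_bit(b, n))
raises for any single flipped bit, since a CRC-32 detects every
single-bit error.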

External agents include bugs in database code, bugs in OS code,
bugs in disk controller firmware, and bugs in disk firmware.
Each can result in clobbered data, blocks being written in the
wrong place, blocks said to be written but not, and any number
of other variations. All this code is written by humans, and
the most thorough testing cannot cover even the majority of
code paths.
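
A CRC over the block contents alone would not notice an intact block
landing in the wrong place. One common way to catch that case, offered
here only as an illustration and not something proposed in this thread,
is to mix the block's intended number into the CRC. The sketch below
assumes an 8-byte block number and a trailing 4-byte CRC-32; seal and
verify are hypothetical names.

```python
import zlib


def seal(block_no: int, payload: bytes) -> bytes:
    """Append a CRC-32 covering the intended block number and the payload."""
    crc = zlib.crc32(block_no.to_bytes(8, "little") + payload)
    return payload + crc.to_bytes(4, "little")


def verify(block_no: int, block: bytes) -> bool:
    """Check a block read from position block_no against its stored CRC."""
    expected = zlib.crc32(block_no.to_bytes(8, "little") + block[:-4])
    return expected.to_bytes(4, "little") == block[-4:]


# A block meant for position 7 that the storage stack wrote at position 8.
block = seal(7, b"row data " * 100)
read_at_right_place = verify(7, block)
read_at_wrong_place = verify(8, block)
```

The same bytes verify at position 7 and fail at position 8, even though
nothing in the payload itself was damaged.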

External agents include sector errors not caught by the disk CRC:
the disk only promises to keep the number of errors delivered to a
reasonably low (and documented) level. It's up to the user to
notice the errors that slip through.

and Andreas wrote:
> > A disk-block CRC would detect partially written blocks (ie, power
> > drops after disk has written M of the N sectors in a block). The
> > disk's own checks will NOT consider this condition a failure.
>
> But physical log recovery will rewrite every page that was changed
> after last checkpoint, thus this is not an issue anymore.

No. That assumes that when the drive _says_ the block is written,
it is really on the disk. That is not true for IDE drives. It is
true for SCSI drives only when the SCSI spec is implemented correctly,
but implementing the spec correctly interferes with favorable benchmark
results.
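
The partial-write case quoted above can be sketched as follows. This is
a toy model in Python, not a description of any real drive: it assumes
512-byte sectors, a 16-sector (8 KB) block, and a CRC-32 in the block's
last four bytes, and the helper names are invented. Every sector in the
torn block is individually readable, so the disk reports no error; only
the block-level CRC shows that the sectors do not belong together.

```python
import zlib

SECTOR = 512            # assumed sector size in bytes
SECTORS_PER_BLOCK = 16  # assumed: one 8 KB database block


def with_crc(payload: bytes) -> bytes:
    """Build a block: payload followed by its CRC-32."""
    return payload + zlib.crc32(payload).to_bytes(4, "little")


def crc_ok(block: bytes) -> bool:
    """True if the block's payload still matches its stored CRC-32."""
    return zlib.crc32(block[:-4]).to_bytes(4, "little") == block[-4:]


def torn_write(old: bytes, new: bytes, sectors_written: int) -> bytes:
    """Model power loss after the first sectors_written sectors hit disk."""
    cut = sectors_written * SECTOR
    return new[:cut] + old[cut:]


payload_len = SECTOR * SECTORS_PER_BLOCK - 4
old_block = with_crc(b"A" * payload_len)
new_block = with_crc(b"B" * payload_len)

# Power drops after 5 of the 16 sectors have been written.
torn = torn_write(old_block, new_block, sectors_written=5)
torn_detected = not crc_ok(torn)
```

Both the old and the new block pass crc_ok; the mixture of the two does
not.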

> > I'm not convinced that WAL will reliably detect it either
> > (Vadim?). Certainly WAL will not help for corruption caused by
> > external agents, away from any updates that are actually being
> > performed/logged.
>
> The external agent (if malevolent) could write a correct CRC anyway.
> If on the other hand the agent writes complete garbage, vacuum will
> notice.

Vacuum does not check most of the bits in the blocks it reads.
(Bad bits in metadata will cause a crash only if you're lucky.
If not, they result in more corruption.)

A database is unusual among computer applications in that an error
introduced today can sit unnoticed on the disk, and then result in
an unnoticed wrong answer six months later. We need to be able to
detect bad bits as soon as possible, before the backups have been
overwritten. CRCs are how we can detect cumulative corruption from
all sources.
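
As a rough illustration of catching bad bits early, a periodic pass can
re-check every block against its CRC and report the failures while good
backups still exist. The sketch below is Python rather than anything
from the PostgreSQL tree; it assumes each block ends with a CRC-32 of
its contents, and seal, is_intact and scrub are hypothetical names.

```python
import zlib


def seal(payload: bytes) -> bytes:
    """Store a CRC-32 of the payload in the last four bytes of the block."""
    return payload + zlib.crc32(payload).to_bytes(4, "little")


def is_intact(block: bytes) -> bool:
    """True if the block's contents still match its stored CRC-32."""
    return zlib.crc32(block[:-4]).to_bytes(4, "little") == block[-4:]


def scrub(blocks):
    """Return the indexes of all blocks whose contents fail their CRC."""
    return [i for i, block in enumerate(blocks) if not is_intact(block)]


# Four hypothetical blocks; one bit in block 2 then decays silently.
table = [seal(bytes([i]) * 8188) for i in range(4)]
rotted = bytearray(table[2])
rotted[1000] ^= 0x10
table[2] = bytes(rotted)

bad_blocks = scrub(table)
```

Here scrub reports block 2 and nothing else, without any query ever
having touched the damaged row.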

Nathan Myers
ncm(at)zembu(dot)com
