From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Martijn van Oosterhout <kleptog(at)svana(dot)org> |
Cc: | Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Block-level CRC checks |
Date: | 2008-11-13 22:31:59 |
Message-ID: | 23659.1226615519@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> Actually, the real problem to me seems to be that to check the checksum
> when you read the page in, you need to look at the contents of the page
> and "assume" some of the values in there are correct, before you can
> even calculate the checksum. If the page really is corrupted, chances
> are the item pointers are going to be bogus, but you need to read them
> to calculate the checksum...
Hmm. You could verify the values closely enough to ensure you don't
crash while redoing the CRC calculation, which ought to be sufficient.
Still, I agree that the whole thing looks too Rube Goldbergian to count
as a reliability enhancer, which is what the point is after all.
> Double-buffering allows you to simply checksum the whole page, so
> creating a COMP_CRC32_WITH_COPY() macro would do it. Just allocate a
> block on the stack, copy/checksum it there, do the write() syscall and
> forget it.
I think the argument is about whether we increase our vulnerability to
torn-page problems if we just add a CRC and don't do anything else to
the overall writing process. Right now, a partial write on a
hint-bit-only update merely results in some hint bits getting lost
(as long as you discount the scenario where the disk fails to read a
partially-written sector at all --- maybe we're fooling ourselves to
ignore that?). With a CRC added, that suddenly becomes a corrupted-page
situation, and it's not easy to tell that no real harm was done.
Again, the real bottom line here is whether there will be a *net*
gain in reliability. If a CRC adds too many false-positive
reports of bad data, it's not going to be a win.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Josh Berkus | 2008-11-13 22:40:16 | Re: Simple postgresql.conf wizard |
Previous Message | Simon Riggs | 2008-11-13 22:22:09 | Re: Simple postgresql.conf wizard |