Re: better page-level checksums

From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: better page-level checksums
Date: 2022-06-10 13:36:39
Message-ID: 27aa2269-1b36-b48a-f9ea-ff451208ec76@enterprisedb.com
Lists: pgsql-hackers

On 10.06.22 15:16, Robert Haas wrote:
> I'm not perfectly attached to the idea of using SHA here, but it seems
> to me that's pretty much the standard thing these days. Stephen Frost
> and David Steele pushed hard for SHA checksums in backup manifests,
> and actually wanted it to be the default.

That seems like a reasonable use in that application, since you might
want to verify whether a backup has been (maliciously?) altered rather
than just accidentally bit flipped.

> I think that if you're the kind of person who looks at our existing
> page checksums and finds them too weak, I doubt that CRC-32C is going
> to make you feel any better. You're probably the sort of person who
> thinks that checksums should have a lot of bits, and you're probably
> not going to be satisfied with the properties of an algorithm invented
> in the 1960s. Of course if there's anyone out there who thinks that
> our existing 16-bit checksums are a pile of garbage but would be much
> happier if CRC-32C is an option, I am happy to have them show up here
> and say so, but I find it much more likely that people who want this
> kind of feature would advocate for a more modern algorithm.

I think there ought to be a bit more principled analysis here than just
"let's add a lot more bits". There is probably some kind of information
to be had about how many CRC bits are useful for a given block size, say.
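
As a rough starting point for that kind of analysis, here is a minimal sketch of the simplest possible model: an idealized n-bit checksum misses about 1 in 2^n uniformly random corruptions. The 8 kB page size and the function name are assumptions for illustration. The model says nothing about the burst-error guarantees a real CRC offers for a given block length, which is the part that would need the more careful look.

```python
def undetected_fraction(checksum_bits: int) -> float:
    """Fraction of uniformly random corruptions that an idealized
    checksum of the given width fails to detect (simplified model)."""
    return 2.0 ** -checksum_bits


PAGE_SIZE_BITS = 8192 * 8  # assumption: default 8 kB page

for bits in (16, 32, 64):
    overhead = bits / PAGE_SIZE_BITS
    print(
        f"{bits}-bit checksum: about 1 in {2 ** bits:,} random corruptions "
        f"undetected, {overhead:.4%} of the page used"
    )
```

Under this model the detection rate depends only on the checksum width, not on the block size, so any block-size-specific argument has to come from the structure of the algorithm itself.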

And then there is the question of performance. When data checksums were
first added, there was a lot of concern about that. CRC is usually
baked directly into hardware, so it's about as cheap as we can hope for.
SHA not so much.
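
To get a feel for the cost difference, here is a hedged micro-benchmark sketch. It is not how PostgreSQL computes checksums: it uses Python's zlib.crc32 (plain CRC-32, not CRC-32C, which the standard library does not provide) and hashlib.sha256 on one random 8 kB buffer. The page size, helper names, and run count are assumptions, and the numbers depend heavily on the CPU and on how the libraries were built.

```python
import hashlib
import os
import timeit
import zlib

PAGE_SIZE = 8192  # assumption: default 8 kB page
page = os.urandom(PAGE_SIZE)


def crc_page(data: bytes) -> int:
    """CRC-32 of the buffer (stand-in for CRC-32C)."""
    return zlib.crc32(data)


def sha_page(data: bytes) -> bytes:
    """SHA-256 digest of the buffer."""
    return hashlib.sha256(data).digest()


def per_page_seconds(fn, data: bytes, runs: int = 2000) -> float:
    """Average wall-clock time of one call to fn(data)."""
    return timeit.timeit(lambda: fn(data), number=runs) / runs


crc_time = per_page_seconds(crc_page, page)
sha_time = per_page_seconds(sha_page, page)
print(f"CRC-32:  {crc_time * 1e6:.2f} us per page")
print(f"SHA-256: {sha_time * 1e6:.2f} us per page")
```

A meaningful comparison for this thread would have to be done in C against the actual page checksum code path, with hardware CRC and SHA instructions both in play.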
