Re: better page-level checksums

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: better page-level checksums
Date: 2022-06-10 16:20:00
Message-ID: 20220610162000.GU9030@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Fabien COELHO (coelho(at)cri(dot)ensmp(dot)fr) wrote:
> >I think for this purpose we should limit ourselves to algorithms
> >whose output size is, at minimum, 64 bits, and ideally, a multiple of
> >64 bits. I'm sure there are plenty of options other than the ones that
> >btrfs uses; I mentioned them only as a way of jump-starting the
> >discussion. Note that SHA-256 and BLAKE2B apparently emit enormously
> >wide 32 BYTE checksums. That's a lot of space to consume with a
> >checksum, but your chances of a collision are very small indeed.
>
> My 0.02€ about that:
>
> You do not have to store the whole hash algorithm output, you can truncate
> or reduce (eg by xoring parts) the size to what makes sense for your
> application and security requirements. ISTM that 64 bits is more than enough
> for a page checksum, whatever the underlying hash algorithm.

Agreed on this, but we shouldn't be guessing at what the correct answers
are here; there's published information from standards bodies about this
sort of thing.
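
To make the truncate-or-fold idea above concrete, here is a minimal sketch.
It assumes Python's hashlib and SHA-256 purely for illustration; the function
names, the 8-byte target size, and the all-zero page are hypothetical and are
not something proposed in this thread.

```python
import hashlib


def truncated_digest(page: bytes, nbytes: int = 8) -> bytes:
    """Keep only the first nbytes of the SHA-256 digest of the page."""
    return hashlib.sha256(page).digest()[:nbytes]


def folded_digest(page: bytes, nbytes: int = 8) -> bytes:
    """XOR the digest's nbytes-wide chunks together into one nbytes value."""
    digest = hashlib.sha256(page).digest()
    if len(digest) % nbytes != 0:
        raise ValueError("nbytes must divide the digest length")
    acc = 0
    for i in range(0, len(digest), nbytes):
        acc ^= int.from_bytes(digest[i:i + nbytes], "big")
    return acc.to_bytes(nbytes, "big")


# Hypothetical all-zero 8K page, only to have something to hash.
page = bytes(8192)
print(truncated_digest(page).hex())
print(folded_digest(page).hex())
```

Either variant yields a 64-bit value from a 256-bit digest; which one (if
either) is appropriate is exactly the kind of question the published guidance
should answer rather than this sketch.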

> Also, ISTM that a checksum algorithm does not really need to be
> cryptographically strong, which means that cheaper alternatives are ok,
> although good quality should be sought nevertheless.

Right, if we aren't doing encryption then we just need to focus on what
is needed for the amount of error detection that we want, and we can go
look at how much space that takes when we're talking about 8K or so worth
of data. When we *are* doing encryption, what's interesting is the tag
length, and that's a different thing which has its own published
guidance from standards bodies that we should be looking at. While the
general "need X amount of space on the page to store the
hash/authentication data" problem is the same, the answer to "how much
space is needed" will probably depend on which use case the user
requested. Maybe we'll get lucky and find that there's a reasonable
answer to both which fits in the same amount of space, and we could
possibly leverage that, but let's not try to force that to happen, as
we'll surely get called out if we go against the guidance from the
standards bodies who study this stuff.
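
As a rough illustration of the "how much space is needed" arithmetic, the
sketch below computes the per-page overhead for a few candidate sizes. The
sizes and labels are assumptions picked for illustration, not recommendations,
and the "chance of a miss" column assumes an idealized check whose output is
uniformly distributed; real tag-length requirements for encryption should come
from the standards documents.

```python
PAGE_SIZE = 8192  # bytes; the "8K or so" page discussed above

# Illustrative candidate sizes in bits (assumed for this sketch only).
CANDIDATES = {
    "16-bit checksum": 16,
    "64-bit checksum": 64,
    "128-bit tag": 128,
    "256-bit digest": 256,
}


def overhead(bits: int, page_size: int = PAGE_SIZE) -> tuple:
    """Return (bytes used, percent of the page) for a check of this width."""
    nbytes = bits // 8
    return nbytes, 100.0 * nbytes / page_size


for label, bits in CANDIDATES.items():
    nbytes, pct = overhead(bits)
    # For an ideal n-bit check, a random corruption slips through with
    # probability about 2^-n; this is a simplification, not a guarantee.
    print(f"{label:>16}: {nbytes:2d} bytes, {pct:.3f}% of page, "
          f"chance of a miss ~2^-{bits}")
```

Even the widest of these is well under one percent of an 8K page, so the
space cost alone does not settle the choice between the two use cases.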

Thanks,

Stephen
