Re: Enable data checksums by default

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Sabino Mullane <htamfids(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enable data checksums by default
Date: 2024-08-13 20:07:40
Message-ID: CA+Tgmoa_RUVrROy_iOQmWfsHA-YD6J3BsSurkH6t1qT9QLg1kw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 13, 2024 at 10:42 AM Greg Sabino Mullane <htamfids(at)gmail(dot)com> wrote:
> Fair enough. I think the performance impact is acceptable, as evidenced by the large number of people that turn it on. And it is easy enough to turn it off again, either via --no-data-checksums or pg_checksums --disable.
> When I did some measurements some time ago, I found numbers much less than 5%, but of course it depends on a lot of factors.

I think the bad case is when you have a write workload that is
significantly bigger than shared_buffers but still small enough to fit
comfortably in the OS cache. When everything fits in shared_buffers,
you only need to write dirty buffers once per checkpoint cycle, so
making it more expensive isn't necessarily a big deal. When you're
constantly going to disk, that's so expensive that you don't notice
the computational overhead. But when you're in that middle zone where
you keep evicting buffers from PG but not actually having to write
them down to the disk, then I think it's pretty noticeable.
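
As a rough illustration of the three regimes described above, here is a
minimal sketch. The function name, the size parameters, and the simple
threshold comparisons are assumptions made for illustration only; nothing
like this exists in PostgreSQL, and real behavior depends on many more
factors.

```python
def write_regime(working_set_gb, shared_buffers_gb, os_cache_gb):
    """Classify a write workload by where its working set fits.

    Hypothetical helper: the plain threshold comparisons are a
    simplification of the regimes described in the message.
    """
    if working_set_gb <= shared_buffers_gb:
        # Dirty buffers are written roughly once per checkpoint cycle,
        # so a more expensive write is not necessarily a big deal.
        return "fits in shared_buffers"
    if working_set_gb <= os_cache_gb:
        # Buffers keep getting evicted from PostgreSQL (each eviction of
        # a dirty page computes a checksum), but the writes land in the
        # OS cache rather than on disk, so the CPU cost is visible.
        return "middle zone"
    # Constantly going to disk: I/O cost dwarfs the checksum computation.
    return "disk-bound"


if __name__ == "__main__":
    for size_gb in (4, 32, 500):
        print(size_gb, write_regime(size_gb, shared_buffers_gb=8, os_cache_gb=64))
```

With 8 GB of shared_buffers and 64 GB of OS cache, only the 32 GB working
set lands in the middle zone, where the overhead is expected to be most
noticeable.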

> I've come across people who have regretted not throwing a -k into their initial initdb, but have not yet come across someone who has the opposite regret.

I don't think this is really a fair comparison, because everything
being a little slower all the time is not something that people are
likely to "regret" in the same way that they regret it when a data
corruption issue goes undetected. Corruption that goes undetected
until it causes damage is a single, very painful event that people
are likely to remember, whereas a small performance loss over time
kind of blends into the background. You don't really regret that kind
of thing in the same way that you regret a bad event that happens at a
particular moment in time.

And it's not like we have statistics anywhere that you can look at to
see how much CPU time you spent computing checksums, so if a user DOES
have a performance problem that would not have occurred if checksums
had been disabled, they'll probably never know it.

>> For those uses, this change would render pg_upgrade useless for upgrades from an old instance with default settings to a new instance with default settings. And then users would either need to re-initdb with checksums turned back off, or I suppose run pg_checksums on the old instance before upgrading? This is significant additional complication.
> Meh, re-running initdb with --no-data-checksums seems a fairly low hurdle.

I tend to agree with that, but I would also like to see the sort of
improvements that Peter mentions. It's a lot less work to say "let's
just change the default" and then get mad at anyone who disagrees than
it is to do the engineering to make changing the default less of a
problem. But that kind of engineering really adds a lot of value
compared to just changing the default.

None of that is to say that I'm totally hostile to this change.
Checksums don't actually prevent your data from getting corrupted, or
let you recover it after it does. They just tell you about the
problem, and very often you would have found out anyway. However, they
do have peace-of-mind value. If you've got checksums turned on, you
can verify your checksums regularly and see that they're OK, and
people like that. Whether that's worth the overhead for everyone, I'm
not quite sure.
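
One way to do the regular verification mentioned above is a small wrapper
around pg_checksums --check. This is only a sketch: the helper names are
hypothetical, and it assumes pg_checksums is on PATH and that the cluster
has been shut down cleanly, since pg_checksums only works on an offline
cluster.

```python
import subprocess


def build_check_command(data_dir):
    """Build the argv for an offline checksum verification run."""
    return ["pg_checksums", "--check", "--pgdata", data_dir]


def checksums_ok(data_dir):
    """Return True if pg_checksums reports no checksum failures.

    Hypothetical helper. pg_checksums exits with a nonzero status when
    it finds at least one checksum failure, and also when it cannot run
    at all, for example because the server is still running.
    """
    result = subprocess.run(
        build_check_command(data_dir),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

A True result only says that the pages on disk match their checksums at
the time of the run. As noted above, that neither prevents corruption nor
repairs it.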

--
Robert Haas
EDB: http://www.enterprisedb.com
