Re: Enable data checksums by default

From: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>
To: Greg Sabino Mullane <htamfids(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enable data checksums by default
Date: 2024-08-15 07:49:26
Message-ID: CAKZiRmw5m9GZCV0V4O2oj5JgabbFqNrtFnqtH5KQMwMK36zBWg@mail.gmail.com
Lists: pgsql-hackers

Hi Greg and others,

On Tue, Aug 13, 2024 at 4:42 PM Greg Sabino Mullane <htamfids(at)gmail(dot)com> wrote:
>
> On Thu, Aug 8, 2024 at 6:11 AM Peter Eisentraut <peter(at)eisentraut(dot)org> wrote:
>
>>
>> My understanding was that the reason for some hesitation about adopting data checksums was the performance impact. Not the checksumming itself, but the overhead from hint bit logging. The last time I looked into that, you could get performance impacts on the order of 5% tps. Maybe that's acceptable, and you of course can turn it off if you want the extra performance. But I think this should be discussed in this thread.
>
>
> Fair enough. I think the performance impact is acceptable, as evidenced by the large number of people that turn it on. And it is easy enough to turn it off again, either via --no-data-checksums or pg_checksums --disable. I've come across people who have regretted not throwing a -k into their initial initdb, but have not yet come across someone who has the opposite regret. When I did some measurements some time ago, I found numbers much less than 5%, but of course it depends on a lot of factors.

Same here, and +1 to data_checksums=on by default for new installations.

To the best of my knowledge, the best public measurement of the impact
is the one Tomas posted in 2019 [1], where he explicitly pointed out
where the extra WAL from hints/checksums is a problem: SATA disks (low
IOPS). My take: it is 2024 now, and most people are using at least SSDs,
or slow SATA (but in the cloud they can just change the class of I/O if
required to get enough IOPS and avoid too much throttling), so the
price of IOPS has dropped significantly.

>> About the claim that it's already the de-facto standard. Maybe that is approximately true for "serious" installations. But AFAICT, the popular packagings don't enable checksums by default, so there is likely a significant middle tier between "just trying it out" and serious
>> production use that don't have it turned on.
>
>
> I would push back on that "significant" a good bit. The number of Postgres installations in the cloud is very likely to dwarf the total package installations. Maybe not 10 years ago, but now? Maybe someone from Amazon can share some numbers. Not that we have any way to compare against package installs :) But anecdotally the number of people who mention RDS etc. on the various fora has exploded.

Same here. If it helps the case: 43% of all PostgreSQL DBs involved in
any support case or incident at EDB within the last year had
data_checksums=on (at least among those whose data had been collected
using our tooling). That's a surprisingly high number for something
that's off by default, and it makes me think that plenty of customers
are either managed by DBAs who care, or assisted by consultants when
deploying, or simply using TPAexec [2], which has this on by default.

Another thing is that plenty of people run with wal_log_hints=on (with
data_checksums=off) just to have pg_rewind working. As this is strictly
a standby-related tool, it means they don't have WAL/network bandwidth
problems, so the WAL rate in the wild is not high enough to cause
problems. I found just 1 or 2 cases within the last year where we
mentioned that high WAL generation was attributed to
wal_log_hints=on/XLOG_FPI, and apparently they still didn't disable it
(we have plenty of cases related to too much WAL, but those are mostly
due to other, more basic reasons).
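
To spell out why wal_log_hints=on is a fair proxy for the WAL cost of
checksums: as far as I understand the documentation, either setting
causes hint-bit-only page changes to be WAL-logged as full-page images,
and pg_rewind needs one of the two (plus full_page_writes=on). A minimal
Python sketch of that relationship follows; the function names are made
up for illustration and are not any PostgreSQL API.

```python
# Illustrative sketch only: hypothetical helper names, not a PostgreSQL API.
# It encodes the documented relationship between the two settings.

def logs_hint_bit_fpis(data_checksums: bool, wal_log_hints: bool) -> bool:
    """True if the first hint-bit-only change to a page after a checkpoint
    is WAL-logged as a full-page image (XLOG_FPI).

    With data checksums enabled this always happens and wal_log_hints
    is ignored; without checksums it depends on wal_log_hints.
    """
    return data_checksums or wal_log_hints


def pg_rewind_usable(data_checksums: bool, wal_log_hints: bool,
                     full_page_writes: bool = True) -> bool:
    """pg_rewind requires hint-bit changes to be WAL-logged and
    full_page_writes to be on."""
    return full_page_writes and logs_hint_bit_fpis(data_checksums, wal_log_hints)


if __name__ == "__main__":
    # Compare the three configurations discussed in the thread.
    for checksums, hints in [(False, False), (False, True), (True, False)]:
        print(f"data_checksums={checksums} wal_log_hints={hints} "
              f"hint FPIs={logs_hint_bit_fpis(checksums, hints)} "
              f"pg_rewind ok={pg_rewind_usable(checksums, hints)}")
```

In other words, anyone already running wal_log_hints=on is paying the
hint-bit WAL overhead that enabling checksums would add; only the
checksum computation itself would be new.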

-J.

[1] - https://www.postgresql.org/message-id/20190330192543.GH4719%40development
[2] - https://www.enterprisedb.com/docs/pgd/4/deployments/tpaexec/
