Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Michael Banck <michael(dot)banck(at)credativ(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: data_checksums enabled by default (was: Move --data-checksums to common options in initdb --help)
Date: 2021-01-06 18:01:59
Message-ID: 20210106180159.GM27507@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Andres Freund (andres(at)anarazel(dot)de) wrote:
> On 2021-01-06 12:02:40 -0500, Stephen Frost wrote:
> > * Andres Freund (andres(at)anarazel(dot)de) wrote:
> > > On 2021-01-04 19:11:43 +0100, Michael Banck wrote:
> > > > On Saturday, 2021-01-02, at 10:47 -0500, Stephen Frost wrote:
> > > > > I agree with this, but I'd also like to propose, again, as has been
> > > > > discussed a few times, making it the default too.
> > >
> > > FWIW, I am quite doubtful we're there performance-wise. Besides the WAL
> > > logging overhead, the copy we do via PageSetChecksumCopy() shows up
> > > quite significantly in profiles here. Together with the checksums
> > > computation that's *halving* write throughput on fast drives in my aio
> > > branch.
> >
> > Our defaults are not going to win any performance trophies and so I
> > don't see the value in stressing over it here.
>
> Meh^3. There's a difference between defaults that are about resource
> usage (e.g. shared_buffers) and defaults that aren't.

fsync isn't about resource usage.

> > > > This looks much better from the WAL size perspective, there's now almost
> > > > no additional WAL. However, that is because pgbench doesn't do TOAST, so
> > > > in a real-world example it might still be quite a bit larger. Also, the vacuum
> > > > runtime is still 15x longer.
> > >
> > > That's obviously an issue.
> >
> > It'd certainly be nice to figure out a way to improve the VACUUM run but
> > I don't think the impact on the time to run VACUUM is really a good
> > reason to not move forward with changing the default.
>
> Vacuum performance is one of *THE* major complaints about
> postgres. Making it run slower by a lot obviously exacerbates that
> problem significantly. I think it'd be prohibitively expensive if it
> were 1.5x, not to even speak of 15x.

We already make vacuum, when it's run by autovacuum, relatively slow,
quite intentionally. If someone's having trouble with vacuum run times
they're going to be adjusting the configuration anyway.

> > imv, enabling page checksums is akin to having fsync enabled by default.
> > Does it impact performance? Yes, surely quite a lot, but it's also the
> > safe and sane choice when it comes to defaults.
>
> Oh for crying out loud.

Not sure what you're hoping to gain from such comments, but it doesn't
do anything to change my opinion.

Thanks,

Stephen
