Re: backup manifests

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: David Steele <david(at)pgmasters(dot)net>
Cc: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: backup manifests
Date: 2019-11-25 17:21:56
Message-ID: CA+TgmoaEquK9dYZWA4Z3TdDXsu4a72etCjG0Sk-jZLRt7HMsSA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 22, 2019 at 2:02 PM David Steele <david(at)pgmasters(dot)net> wrote:
> > Except - and this gets back to the previous point - I don't want to
> > slow down backups by 40% by default. I wouldn't mind slowing them down
> > 3% by default, but 40% is too much overhead. I think we've gotta
> > either the overhead of using SHA way down or not use SHA by default.
>
> Maybe -- my take is that the measurements, an uncompressed backup to the
> local filesystem, are not a very realistic use case.

Well, compression is a feature we don't have yet, in core. So for
people who are only using core tools, an uncompressed backup is a very
realistic use case, because it's the only kind they can get. Granted
the situation is different if you are using pgbackrest.

I don't have enough experience to know how often people back up to
local filesystems vs. remote filesystems mounted locally vs. overtly
over-the-network. I sometimes get the impression that users choose
their backup tools and procedures with, as Tom would say, the aid of a
dart board, but that's probably the cynic in me talking. Or maybe a
reflection of the fact that I usually end up talking to the users for
whom things have gone really, really badly wrong, rather than the ones
for whom things went as planned.

> However, I'm still fine with leaving the user the option of checksums or
> no. I just wanted to point out that CRCs have their limits so maybe
> that's not a great option unless it is properly caveated and perhaps not
> the default.

I think the default is the sticking point here. To me, it looks like
CRC is a better default than nothing at all because it should still
catch a high percentage of issues that would otherwise be missed, and
a better default than SHA because it's so cheap to compute. However,
I'm certainly willing to consider other theories.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-11-25 17:43:18 Re: backup manifests
Previous Message Robert Haas 2019-11-25 17:11:53 Re: backup manifests