Re: backup manifests

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Suraj Kharage <suraj(dot)kharage(at)enterprisedb(dot)com>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Tels <nospam-pg-abuse(at)bloodgate(dot)com>, David Steele <david(at)pgmasters(dot)net>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: backup manifests
Date: 2020-03-27 19:55:12
Message-ID: 20200327195512.GG13712@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greetings,

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Thu, Mar 26, 2020 at 4:44 PM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > Is it actually possible, today, in PG, to have a 4GB WAL record?
> > Judging this based on the WAL record size doesn't seem quite right.
>
> I'm not sure. I mean, most records are quite small, but I think if you
> set REPLICA IDENTITY FULL on a table with a bunch of very wide columns
> (and also wal_level=logical) it can get really big. I haven't tested
> to figure out just how big it can get. (If I have a table with lots of
> almost-1GB-blobs in it, does it work without logical replication and
> fail with logical replication? I don't know, but I doubt a WAL record
> >4GB is possible, because it seems unlikely that the code has a way to
> cope with that struct field overflowing.)

Interesting.. Well, topic for another thread, but I'd say if we believe
that's possible then we might want to consider if the crc32c is a good
decision to use still there.

> > Again, I'm not against having a checksum algorithm as a option. I'm not
> > saying that it must be SHA512 as the default.
>
> I think that what we have seen so far is that all of the SHA-n
> algorithms that PostgreSQL supports are about equally slow, so it
> doesn't really matter which one you pick there from a performance
> point of view. If you're not saying it has to be SHA-512 but you do
> want it to be SHA-256, I don't think that really fixes anything. Using
> CRC-32C does fix the performance issue, but I don't think you like
> that, either. We could default to having no checksums at all, or even
> no manifest at all, but I didn't get the impression that David, at
> least, wanted to go that way, and I don't like it either. It's not the
> world's best feature, but I think it's good enough to justify enabling
> it by default. So I'm not sure we have any options here that will
> satisfy you.

I do like having a manifest by default. At this point it's pretty clear
that we've just got a fundamental disagreement that more words aren't
going to fix. I'd rather we play it safe and use a sha256 hash and
accept that it's going to be slower by default, and then give users an
option to make it go faster if they want (though I'd much rather that
alternative be a 64bit CRC than a 32bit one).

Andres seems to agree with you. I'm not sure where David sits on this
specific question.

Thanks,

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2020-03-27 19:56:24 Re: backup manifests
Previous Message Alvaro Herrera 2020-03-27 19:51:49 Re: allow online change primary_conninfo