From: | David Steele <david(at)pgmasters(dot)net> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Suraj Kharage <suraj(dot)kharage(at)enterprisedb(dot)com>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Tels <nospam-pg-abuse(at)bloodgate(dot)com>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, vignesh C <vignesh21(at)gmail(dot)com> |
Subject: | Re: documenting the backup manifest file format |
Date: | 2020-04-13 20:42:03 |
Message-ID: | 3d49cad1-43f3-4c8a-1676-fb552892e629@pgmasters.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 4/13/20 4:14 PM, Robert Haas wrote:
> On Mon, Apr 13, 2020 at 3:34 PM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
>
>> Also, I
>> see no mention of prettification-chars such as newlines or indentation.
>> I suppose if I pass a manifest file through prettification (or Windows
>> newline conversion), the checksum may break.
>
> It would indeed break. I'm not sure what you want me to say here,
> though. If you're trying to parse a manifest, you shouldn't care about
> how the whitespace is arranged. If you're trying to generate one, you
> can arrange it any way you like, as long as you also include it in the
> checksum.
pgBackRest ignores whitespace but this is a legacy of the way Perl
calculated checksums, not an intentional feature. This worked well when
the manifest was loaded as a whole, converted to JSON, and checksummed,
but it is a major pain for the streaming code we now have in C.
I guarantee that that our next manifest version will do a simple
checksum of bytes as Robert has done in this feature.
So, I'm +1 as implemented.
>> Why is the top-level checksum only allowed to be SHA-256, if the files
>> can use up to SHA-512?
<snip>
> I agree that it's a little bit weird that you can have a stronger
> checksum for the files instead of the manifest itself, but I also
> wonder what the use case would be for using a stronger checksum on the
> manifest. David Steele argued that strong checksums on the files could
> be useful to software that wants to rifle through all the backups
> you've ever taken and find another copy of that file by looking for
> something with a matching checksum. CRC-32C wouldn't be strong enough
> for that, because eventually you could have enough files that you
> start to have collisions. The SHA algorithms output enough bits to
> make that quite unlikely. But this argument only makes sense for the
> files, not the manifest.
Agreed. I think SHA-256 is *more* than enough to protect the manifest
against corruption. That said, since the cost of SHA-256 vs. SHA-512 in
the context on the manifest is negligible we could just use the stronger
algorithm to deflect a similar question going forward.
That choice might not age well, but we could always say, well, we picked
it because it was the strongest available at the time. Allowing a choice
of which algorithm to use for to manifest checksum seems like it will
just make verifying the file harder with no tangible benefit.
Maybe just a comment in the docs about why SHA-256 was used would be fine.
>> (Also, did we intentionally omit the dash in
>> hash names, so "SHA-256" to make it SHA256? This will also be critical
>> for checksumming the manifest itself.)
>
> I debated this with myself, settled on this spelling, and nobody
> complained until now. It could be changed, though. I didn't have any
> particular reason for choosing it except the feeling that people would
> probably prefer to type --manifest-checksum=sha256 rather than
> --manifest-checksum=sha-256.
+1 for sha256 rather than sha-256.
Regards,
--
-David
david(at)pgmasters(dot)net
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2020-04-13 20:57:19 | Re: Poll: are people okay with function/operator table redesign? |
Previous Message | Tom Lane | 2020-04-13 20:41:51 | Re: Poll: are people okay with function/operator table redesign? |