Re: pg_verifybackup: TAR format backup verification

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amul Sul <sulamul(at)gmail(dot)com>, Sravan Kumar <sravanvcybage(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_verifybackup: TAR format backup verification
Date: 2024-09-30 20:05:01
Message-ID: CA+TgmoYAUxW4GzEK4puDtA_HNaWPJxfmgqQX8oDkRM3krGJuUg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 30, 2024 at 11:24 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > This is just a string in a
> > JSON file that represents an integer which will hopefully turn out to
> > be the size of the file on disk. I guess I don't know what type I
> > should be using here. Most things in PostgreSQL use a type like uint32
> > or uint64, but technically what we're going to be comparing against in
> > the end is probably an off_t produced by stat(), but the return value
> > of strtoul() or strtoull() is unsigned long or unsigned long long,
> > which is not any of those things. If you have a suggestion here, I'm
> > all ears.
>
> I don't know if it's realistic to expect that this code might be used
> to process JSON blobs exceeding 4GB. But if it is, I'd be inclined to
> use uint64 and strtoull for these purposes, if only to avoid
> cross-platform hazards with varying sizeof(long) and sizeof(size_t).
>
> Um, wait ... we do have strtou64(), so you should use that.

The thing we should be worried about is not how large a JSON blob
might be, but rather how large any file that appears in the data
directory might be. So uint32 is out; and I think I hear you voting
for uint64 over size_t. But then how do you think we should print
that? Cast to unsigned long long and use %llu?

> >> Aside from that, I'm unimpressed with expending a five-line comment
> >> at line 376 to justify casting control_file_bytes to int,
>
> > I don't know what you mean by this.
>
> I mean that we have a widely-used, better solution. If you copied
> this from someplace else, the someplace else could stand to be
> improved too.

I don't understand what you think the widely-used, better solution is
here. As far as I can see, there are two goods here, between which one
must choose. One can decide to use the same error message string, and
I hope we can agree that's good, because I've been criticized in the
past when I have done otherwise, as have many others. The other good
is to use the most appropriate data type. One cannot have both of
those things in this instance, unless one goes and fixes the other
code also, but such a change had no business being part of this patch.
If the issue had been serious and likely to occur in real life, I
would have probably fixed it in a preparatory patch, but it isn't, so
I settled for adding a comment.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2024-09-30 20:11:58 Re: pg_verifybackup: TAR format backup verification
Previous Message Tom Lane 2024-09-30 19:56:22 Re: set_rel_pathlist function unnecessary params.