Re: pg_verifybackup: TAR format backup verification

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amul Sul <sulamul(at)gmail(dot)com>
Cc: Sravan Kumar <sravanvcybage(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_verifybackup: TAR format backup verification
Date: 2024-08-14 16:44:19
Message-ID: CA+TgmoYOQex-MuAfh+Fe1dRKT+O=nH+fPYJ6bKpHfUxgZpumew@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 14, 2024 at 9:20 AM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> I agree with keeping verify_backup_file() separate, but I'm hesitant
> about doing the same for verify_backup_directory().

I don't have time today to go through your whole email or re-review
the code, but I plan to circle back to that at a later time, However,
I want to respond to this point in the meanwhile. There are two big
things that are different for a tar-format backup vs. a
directory-format backup as far as verify_backup_directory() is
concerned. One is that, for a directory format backup, we need to be
able to recurse down through subdirectories; for tar-format backups we
don't. So a version of this function that only handled tar-format
backups would be somewhat shorter and simpler, and would need one
fewer argument. The second difference is that for the tar-format
backup, you need to make a list of the files you see and then go back
and visit each one a second time, and for a directory-format backup
you don't need to do that. It seems to me that those differences are
significant enough to warrant having two separate functions. If you
unify them, I think that less than half of the resulting function is
going to be common to both cases. Yeah, a few bits of logic will be
duplicated, like the error handling for closedir(), the logic to skip
"." and "..", and the psprintf() to construct a full pathname for the
directory entry. But that isn't really very much code, and it's code
that is pretty straightforward and also present in various other
places in the PostgreSQL source tree, perhaps not in precisely the
same form. The fact that two functions both call readdir() and do
something with each file in the directory isn't enough to say that
they should be the same function, IMHO.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2024-08-14 17:20:40 Re: Remove dependence on integer wrapping
Previous Message David Fetter 2024-08-14 16:31:44 Re: [PATCH] Add get_bytes() and set_bytes() functions