Re: pg_verifybackup: TAR format backup verification

From: Amul Sul <sulamul(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Sravan Kumar <sravanvcybage(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_verifybackup: TAR format backup verification
Date: 2024-08-21 11:07:49
Message-ID: CAAJ_b972EmWT+a63wtTAwfGF3QpYZkzwov5UPh=JPmgJY=dW8A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 20, 2024 at 3:56 PM Amul Sul <sulamul(at)gmail(dot)com> wrote:
>
> On Sat, Aug 17, 2024 at 1:34 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >
> > On Fri, Aug 16, 2024 at 3:53 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
[...]
> > There's probably more to look at here but I'm running out of energy for today.
> >
>
> Thank you for the review and committing 0004 and 0006 patches.
>

I have reworked a few comments, revised error messages, and made some
minor tweaks in the attached version.

Additionally, I would like to discuss a couple of concerns regarding
error placement and function names to gather your suggestions.

0007 patch: Regarding error placement:

1. I'm a bit unsure about the (bytes_read != m->size) check that I
placed in verify_checksum() and whether it's in the right place. Per
our previous discussion, this check is applicable to plain backup
files since they can change while being read, but not for files
belonging to tar backups. For consistency, I included the check for
tar backups as well, as it doesn't cause any harm. Is it okay to keep
this check in verify_checksum(), or should I move it back to
verify_file_checksum() and apply it only to the plain backup format?

2. For the verify_checksum() function, I kept the argument name as
bytes_read. Should we rename it to something more meaningful like
computed_bytes, computed_size, or checksum_computed_size?

0011 patch: Regarding function names:

1. named the function verify_tar_backup_file() to align with
verify_plain_backup_file(), but it does not perform the complete
verification as verify_plain_backup_file does. Not sure if it is the
right name.

2. verify_tar_file_contents() is the second and final part of tar
backup verification. Should its name be aligned with
verify_tar_backup_file()? I’m unsure what the best name would be.
Perhaps verify_tar_backup_file_final(), but then
verify_tar_backup_file() would need to be renamed to something like
verify_tar_backup_file_initial(), which might be too lengthy.

3. verify_tar_contents() is the core of verify_tar_file_contents()
that handles the actual verification. I’m unsure about the current
naming. Should we rename it to something like
verify_tar_contents_core()? It wouldn’t be an issue if we renamed
verify_tar_file_contents() as pointed in point #2.

Regards,
Amul

Attachment Content-Type Size
v12-0006-Refactor-split-verify_backup_file-function-and-r.patch application/x-patch 6.3 KB
v12-0007-Refactor-split-verify_file_checksum-function.patch application/x-patch 3.2 KB
v12-0008-Refactor-split-verify_control_file.patch application/x-patch 5.4 KB
v12-0009-Add-simple_ptr_list_destroy-and-simple_ptr_list_.patch application/x-patch 2.2 KB
v12-0010-pg_verifybackup-Add-backup-format-and-compressio.patch application/x-patch 6.1 KB
v12-0011-pg_verifybackup-Read-tar-files-and-verify-its-co.patch application/x-patch 27.7 KB
v12-0012-pg_verifybackup-Tests-and-document.patch application/x-patch 13.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2024-08-21 11:29:58 Re: Add llvm version into the version string
Previous Message Dean Rasheed 2024-08-21 10:51:59 Re: Virtual generated columns