From: | Amul Sul <sulamul(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Sravan Kumar <sravanvcybage(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de> |
Subject: | Re: pg_verifybackup: TAR format backup verification |
Date: | 2024-08-12 09:12:24 |
Message-ID: | CAAJ_b95mcGjkfAf1qduOR97CokW8-_i-dWLm3v6x1w2-OW9M+A@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Aug 7, 2024 at 11:28 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Wed, Aug 7, 2024 at 1:05 PM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> > The main issue I have is computing the total_size of valid files that
> > will be checksummed and that exist in both the manifests and the
> > backup, in the case of a tar backup. This cannot be done in the same
> > way as with a plain backup.
>
> I think you should compute and sum the sizes of the tar files
> themselves. Suppose you readdir(), make a list of files that look
> relevant, and stat() each one. total_size is the sum of the file
> sizes. Then you work your way through the list of files and read each
> one. done_size is the total size of all files you've read completely
> plus the number of bytes you've read from the current file so far.
>
I tried this in the attached version and made a few additional changes
based on Sravan's off-list comments regarding function names and
descriptions.
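
For illustration only, the size accounting suggested above could be sketched roughly as follows. This is not the code from the attached patches: the helper names (has_tar_suffix, sum_tar_file_sizes, progress_done_size) and the list of suffixes treated as "relevant" are assumptions made up for this example.

```c
#include <dirent.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: does the name end in a suffix we treat as a tar archive? */
static bool
has_tar_suffix(const char *name)
{
    /* Assumed suffix list, for illustration only. */
    static const char *const suffixes[] = {".tar", ".tar.gz", ".tar.lz4", ".tar.zst"};
    size_t      namelen = strlen(name);

    for (size_t i = 0; i < sizeof(suffixes) / sizeof(suffixes[0]); i++)
    {
        size_t      suflen = strlen(suffixes[i]);

        if (namelen > suflen &&
            strcmp(name + namelen - suflen, suffixes[i]) == 0)
            return true;
    }
    return false;
}

/*
 * total_size: sum of the on-disk sizes of the tar-like regular files in
 * dirpath.  Returns false if the directory cannot be opened.
 */
static bool
sum_tar_file_sizes(const char *dirpath, uint64_t *total_size)
{
    DIR        *dir = opendir(dirpath);
    struct dirent *de;

    if (dir == NULL)
        return false;

    *total_size = 0;
    while ((de = readdir(dir)) != NULL)
    {
        char        path[4096];
        struct stat st;

        if (!has_tar_suffix(de->d_name))
            continue;
        snprintf(path, sizeof(path), "%s/%s", dirpath, de->d_name);
        if (stat(path, &st) == 0 && S_ISREG(st.st_mode))
            *total_size += (uint64_t) st.st_size;
    }
    closedir(dir);
    return true;
}

/* done_size: bytes of fully read files plus bytes read from the current file. */
static uint64_t
progress_done_size(uint64_t completed_bytes, uint64_t current_file_bytes_read)
{
    return completed_bytes + current_file_bytes_read;
}
```

With this accounting, progress is measured in bytes of the tar files themselves rather than in bytes of the files they contain, so no lookup against the manifest is needed to compute the total.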
Now, verification happens in two passes. The first pass simply
verifies the file names, determines their compression types, and
returns a list of valid tar files whose contents need to be verified
in the second pass. The second pass is called at the end of
verify_backup_directory() after all files in that directory have been
scanned. I named the functions for pass 1 and pass 2 as
verify_tar_file_name() and verify_tar_file_contents(), respectively.
The rest of the code flow is similar to that of the previous version.
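
The two-pass shape described above might look something like the sketch below. It is a simplified stand-in, not the patch: every identifier prefixed with sketch_ or Sketch is hypothetical, the suffix-to-compression mapping is an assumption, and the real verify_tar_file_name() and verify_tar_file_contents() may have quite different signatures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical compression tags; SKETCH_NOT_TAR marks names that are skipped. */
typedef enum
{
    SKETCH_NOT_TAR,
    SKETCH_NONE,
    SKETCH_GZIP,
    SKETCH_LZ4,
    SKETCH_ZSTD
} SketchCompression;

typedef struct
{
    const char *name;
    SketchCompression compression;
} SketchTarFile;

static bool
ends_with(const char *s, const char *suffix)
{
    size_t      slen = strlen(s);
    size_t      suflen = strlen(suffix);

    return slen > suflen && strcmp(s + slen - suflen, suffix) == 0;
}

/* Pass 1, per file: look only at the name and decide the compression type. */
static SketchCompression
sketch_tar_compression(const char *name)
{
    if (ends_with(name, ".tar"))
        return SKETCH_NONE;
    if (ends_with(name, ".tar.gz"))
        return SKETCH_GZIP;
    if (ends_with(name, ".tar.lz4"))
        return SKETCH_LZ4;
    if (ends_with(name, ".tar.zst"))
        return SKETCH_ZSTD;
    return SKETCH_NOT_TAR;
}

/* Pass 1, whole directory: collect the files whose contents need checking. */
static size_t
sketch_pass1(const char *const *names, size_t n, SketchTarFile *out)
{
    size_t      kept = 0;

    for (size_t i = 0; i < n; i++)
    {
        SketchCompression c = sketch_tar_compression(names[i]);

        if (c != SKETCH_NOT_TAR)
        {
            out[kept].name = names[i];
            out[kept].compression = c;
            kept++;
        }
    }
    return kept;
}

typedef bool (*SketchVerifyFn) (const SketchTarFile *file);

/* Placeholder for the real content check; it always reports success. */
static bool
sketch_verify_stub(const SketchTarFile *file)
{
    (void) file;
    return true;
}

/* Pass 2: run once the scan is finished; returns the number of failures. */
static size_t
sketch_pass2(const SketchTarFile *files, size_t n, SketchVerifyFn verify)
{
    size_t      failures = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (!verify(&files[i]))
            failures++;
    }
    return failures;
}
```

Deferring the content check to a second pass means the full list of files is known before any archive is read, which is what makes the up-front total_size computation possible.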
In the attached patch set, I abandoned the changes touching the
progress reporting code of plain backups by dropping the previous 0009
patch. The new 0009 patch adds missing APIs to simple_list.c to
destroy SimplePtrList. The rest of the patch numbers remain unchanged.
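
To give a rough idea of what destroying a singly linked pointer list involves, here is a self-contained sketch. It deliberately uses its own hypothetical DemoPtrList types and demo_ function names rather than the real SimplePtrList definitions, so it should not be read as the contents of the 0009 patch.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a pointer-list cell and list header. */
typedef struct DemoPtrListCell
{
    struct DemoPtrListCell *next;
    void       *ptr;
} DemoPtrListCell;

typedef struct DemoPtrList
{
    DemoPtrListCell *head;
    DemoPtrListCell *tail;
} DemoPtrList;

/* Append ptr at the tail; returns false if allocation fails. */
static bool
demo_ptr_list_append(DemoPtrList *list, void *ptr)
{
    DemoPtrListCell *cell = malloc(sizeof(DemoPtrListCell));

    if (cell == NULL)
        return false;
    cell->next = NULL;
    cell->ptr = ptr;

    if (list->tail != NULL)
        list->tail->next = cell;
    else
        list->head = cell;
    list->tail = cell;
    return true;
}

/*
 * Free every cell and reset the list to empty.  The pointed-to objects are
 * not freed here; the caller still owns them.
 */
static void
demo_ptr_list_destroy(DemoPtrList *list)
{
    DemoPtrListCell *cell = list->head;

    while (cell != NULL)
    {
        DemoPtrListCell *next = cell->next;

        free(cell);
        cell = next;
    }
    list->head = NULL;
    list->tail = NULL;
}
```

Whether the list cells alone or also the pointed-to objects should be released is a design choice; this sketch frees only the cells.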
Regards,
Amul
Attachment | Content-Type | Size |
---|---|---|
v9-0011-pg_verifybackup-Read-tar-files-and-verify-its-con.patch | application/x-patch | 29.0 KB |
v9-0008-Refactor-split-verify_control_file.patch | application/x-patch | 5.7 KB |
v9-0012-pg_verifybackup-Tests-and-document.patch | application/x-patch | 12.5 KB |
v9-0010-pg_verifybackup-Add-backup-format-and-compression.patch | application/x-patch | 6.2 KB |
v9-0009-Add-simple_ptr_list_destroy-and-simple_ptr_list_d.patch | application/x-patch | 2.2 KB |
v9-0005-Refactor-move-some-part-of-pg_verifybackup.c-to-p.patch | application/x-patch | 7.8 KB |
v9-0007-Refactor-split-verify_file_checksum-function.patch | application/x-patch | 2.9 KB |
v9-0004-Refactor-move-skip_checksums-global-variable-to-v.patch | application/x-patch | 1.9 KB |
v9-0006-Refactor-split-verify_backup_file-function.patch | application/x-patch | 4.8 KB |