Re: pg_verifybackup: TAR format backup verification

From: Amul Sul <sulamul(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Sravan Kumar <sravanvcybage(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: pg_verifybackup: TAR format backup verification
Date: 2024-08-07 17:05:00
Message-ID: CAAJ_b95rXAsyuetre9Wc-W6uHW-=ivVj56F8-w4gfrTGsnRRSA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 7, 2024 at 9:12 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> [ I committed 0001, then noticed I had a type in the subject line of
> the commit message. Argh. ]
>
> On Wed, Aug 7, 2024 at 9:41 AM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> > With the patch, I am concerned that we won't be able to give an
> > accurate progress report as before. We add all the file sizes in the
> > backup manifest to the total_size without checking if they exist on
> > disk. Therefore, sometimes the reported progress completion might not
> > show 100% when we encounter files where m->bad or m->match == false at
> > a later stage. However, I think this should be acceptable since there
> > will be an error for the respective missing or bad file, and it can be
> > understood that verification is complete even if the progress isn't
> > 100% in that case. Thoughts/Comments?
>
> When somebody says that something is a refactoring commit, my
> assumption is that there should be no behavior change. If the behavior
> is changing, it's not purely a refactoring, and it shouldn't be
> labelled as a refactoring (or at least there should be a prominent
> disclaimer identifying whatever behavior has changed, if a small
> change was deemed acceptable and unavoidable).
>

I agree; I'll be more careful next time.

> I am very reluctant to accept a functional regression of the type that
> you describe here (or the type that I postulated might occur, even if
> I was wrong and it doesn't). The point here is that we're trying to
> reuse the code, and I support that goal, because code reuse is good.
> But it's not such a good thing that we should do it if it has negative
> consequences. We should either figure out some other way of
> refactoring it that doesn't have those negative side-effects, or we
> should leave the existing code alone and have separate code for the
> new stuff we want to do.
>
> I do realize that the type of side effect you describe here is quite
> minor. I could live with it if it were unavoidable. But I really don't
> see why we can't avoid it.
>

The main issue I have is computing the total_size of valid files that
will be checksummed and that exist in both the manifests and the
backup, in the case of a tar backup. This cannot be done in the same
way as with a plain backup.

Another consideration is that, in the case of a tar backup, since
we're reading all tar files from the backup rather than individual
files to be checksummed, should we consider total_size as the sum of
all valid tar files in the backup, regardless of checksum
verification? If so, we would need to perform an initial pass to
calculate the total_size in the directory, similar to what
verify_backup_directory() does., but doing so I am a bit uncomfortable
and unsure if we should do that pass.

An alternative idea is to provide progress reports per file instead of
for the entire backup directory. We could report the size of each file
and keep track of done_size as we read each tar header and content.
For example, the report could be:

109032/109032 kB (100%) base.tar file verified
123444/123444 kB (100%) 16245.tar file verified
23478/23478 kB (100%) 16246.tar file verified
.....
<total_done_size>/<total_size> (NNN%) verified.

I think this type of reporting can be implemented with minimal
changes, abd for plain backups, we can keep the reporting as it is.
Thoughts?

Regards,
Amul

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2024-08-07 17:06:32 Re: Add LSN <-> time conversion functionality
Previous Message Melanie Plageman 2024-08-07 16:30:53 Re: Add LSN <-> time conversion functionality