Re: pg_verifybackup: TAR format backup verification

From: Amul Sul <sulamul(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Sravan Kumar <sravanvcybage(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_verifybackup: TAR format backup verification
Date: 2024-07-31 13:28:16
Message-ID: CAAJ_b96Gwh=6zwpqsD-sVNOeYcU=9rfBygVtuXmU3qFAiNr2Hw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jul 30, 2024 at 9:04 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Mon, Jul 22, 2024 at 7:53 AM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> > Fix in the attached version.
>
> First of all, in the interest of full disclosure, I suggested this
> project to Amul, so I'm +1 on the concept. I think making more of our
> backup-related tools work with tar and compressed tar formats -- and
> perhaps eventually data not stored locally -- will make them a lot
> more usable. If, for example, you take a full backup and an
> incremental backup, each in tar format, store them in the cloud
> someplace, and then want to verify and afterwards restore the
> incremental backup, you would need to download the tar files from the
> cloud, then extract all the tar files, then run pg_verifybackup and
> pg_combinebackup over the results. With this patch set, and similar
> work for pg_combinebackup, you could skip the step where you need to
> extract the tar files, saving significant amounts of time and disk
> space. If the tools also had the ability to access data remotely, you
> could save even more, but that's a much harder project, so it makes
> sense to me to start with this.
>
> Second, I think this patch set is quite well-organized and easy to
> read. That's not to say there is nothing in these patches to which
> someone might object, but it seems to me that it should at least be
> simple for anyone who wants to review to find the things to which they
> object in the patch set without having to spend too much time on it,
> which is fantastic.
>
> Third, I think the general approach that these patches take to the
> problem - namely, renaming bbstreamer to astreamer and moving it
> somewhere that permits it to be reused - makes a lot of sense. To be
> honest, I don't think I had it in mind that bbstreamer would be a
> reusable component when I wrote it, or if I did have it in mind, it
> was off in some dusty corner of my mind that doesn't get visited very
> often. I was imagining that you would need to build new infrastructure
> to deal with reading the tar file, but I think what you've done here
> is better. Reusing the bbstreamer stuff gives you tar file parsing,
> and decompression if necessary, basically for free, and IMHO the
> result looks rather elegant.
>

Thank you so much for the summary and the review.

> However, I'm not very convinced by 0003. The handling between the
> meson and make-based build systems doesn't seem consistent. On the
> meson side, you just add the objects to the same list that contains
> all of the other files (but not in alphabetical order, which should be
> fixed). But on the make side, you for some reason invent a separate
> AOBJS list instead of just adding the files to OBJS. I don't think it
> should be necessary to treat these objects any differently from any
> other objects, so they should be able to just go in OBJS: but if it
> were necessary, then I feel like the meson side would need something
> similar.
>

Fixed -- I did that because it was part of a separate group in pg_basebackup.

> Also, I'm not so sure about this change to src/fe_utils/meson.build:
>
> - dependencies: frontend_common_code,
> + dependencies: [frontend_common_code, lz4, zlib, zstd],
>
> frontend_common_code already includes dependencies on zlib and zstd,
> so we probably don't need to add those again here. I checked the
> result of otool -L src/bin/pg_controldata/pg_controldata from the
> meson build directory, and I find that currently it links against libz
> and libzstd but not liblz4. However, if I either make this line say
> dependencies: [frontend_common_code, lz4] or if I just update
> frontend_common_code to include lz4, then it starts linking against
> liblz4 as well. I'm not entirely sure if there's any reason to do one
> or the other of those things, but I think I'd be inclined to make
> frontend_common_code just include lz4 since it already includes zlib
> and zstd anyway, and then you don't need this change.
>

Fixed -- frontend_common_code now includes lz4 as well.

> Alternatively, we could take the position that we need to avoid having
> random front-end tools that don't do anything with compression at all,
> like pg_controldata for example, to link with compression libraries at
> all. But then we'd need to rethink a bunch of things that have not
> much to do with this patch.
>

Noted. I might give it a try another day, unless someone else beats
me, perhaps in a separate thread.

> Regarding 0004, I would rather not move show_progress and
> skip_checksums to the new header file. I suppose I was a bit lazy in
> making these file-level global variables instead of passing them down
> using function arguments and/or a context object, but at least right
> now they're only global within a single file. Can we think of
> inserting a preparatory patch that moves these into verifier_context?
>

Done -- added a new patch as 0004, and the subsequent patch numbers
have been incremented accordingly.

> Regarding 0005, the comment /* Check whether there's an entry in the
> manifest hash. */ should move inside verify_manifest_entry, where
> manifest_files_lookup is called. The call to the new function
> verify_manifest_entry() needs its own, proper comment. Also, I think
> there's a null-pointer deference hazard here, because
> verify_manifest_entry() can return NULL but the "Validate the manifest
> system identifier" chunk assumes it isn't. I think you could hit this
> - and presumably seg fault - if pg_control is on disk but not in the
> manifest. Seems like just adding an m != NULL test is easiest, but
> see also below comments about 0006.
>

Fixed -- I did the NULL check in the earlier 0007 patch, but it should
have been done in this patch.

> Regarding 0006, suppose that the member file within the tar archive is
> longer than expected. With the unpatched code, we'll feed all of the
> data to the checksum context, but then, after the read-loop
> terminates, we'll complain about the file being the wrong length. With
> the patched code, we'll complain about the checksum mismatch before
> returning from verify_content_checksum(). I think that's an unintended
> behavior change, and I think the new behavior is worse than the old
> behavior. But also, I think that in the case of a tar file, the
> desired behavior is quite different. In that case, we know the length
> of the file from the member header, so we can check whether the length
> is as expected before we read any of the data bytes. If we discover
> that the size is wrong, we can complain about that and need not feed
> the checksum bytes to the checksum context at all -- we can just skip
> them, which will be faster. That approach doesn't really make sense
> for a file, because even if we were to stat() the file before we
> started reading it, the length could theoretically change as we are
> reading it, if someone is concurrently modifying it, but for a tar
> file I think it does.
>

In the case of a file size mismatch, we never reach the point where
checksum calculation is performed, because verify_manifest_entry()
encounters an error and sets manifest_file->bad to true, which causes
skip_checksum to be set to false. For that reason, I didn’t include
the size check again in the checksum calculation part. This behavior
is the same for plain backups, but the additional file size check was
added as a precaution (per comment in verify_file_checksum()),
possibly for the same reasons you mentioned.

I agree, changing the order of errors could create confusion.
Previously, a file size mismatch was a clear and appropriate error
that was reported before the checksum failure error.

However, this can be fixed by delaying the checksum calculation until
the expected file content size is received. Specifically, return from
verify_content_checksum(), if (*computed_len != m->size). If the file
size is incorrect, the checksum calculation won't be performed, and
the caller's loop reading file (I mean in verify_file_checksum()) will
exit at some point which later encounters the size mismatch error.

> I would suggest abandoning this refactoring. There's very little logic
> in verify_file_checksum() that you can actually reuse. I think you
> should just duplicate the code. If you really want, you could arrange
> to reuse the error-reporting code that checks for checksumlen !=
> m->checksum_length and memcmp(checksumbuf, m->checksum_payload,
> checksumlen) != 0, but even that I think is little enough that it's
> fine to just duplicate it. The rest is either (1) OS calls like
> open(), read(), etc. which won't be applicable to the
> read-from-archive case or (2) calls to pg_checksum_WHATEVER, which are
> fine to just duplicate, IMHO.
>

I kept the refactoring as it is by fixing verify_content_checksum() as
mentioned in the previous paragraph. Please let me know if this fix
and the explanation makes sense to you. I’m okay with abandoning this
refactor patch if you think.

> My eyes are starting to glaze over a bit here so expect comments below
> this point to be only a partial review of the corresponding patch.
>
> Regarding 0007, I think that the should_verify_sysid terminology is
> problematic. I made all the code and identifier names talk only about
> the control file, not the specific thing in the control file that we
> are going to verify, in case in the future we want to verify
> additional things. This breaks that abstraction.
>

Agreed, changed to should_verify_control_data.

> Regarding 0009, I feel like astreamer_verify_content() might want to
> grow some subroutines. One idea could be to move the
> ASTREAMER_MEMBER_HEADER case and likewise ASTREAMER_MEMBER_CONTENTS
> cases into a new function for each; another idea could be to move
> smaller chunks of logic, e.g. under the ASTREAMER_MEMBER_CONTENTS
> case, the verify_checksums could be one subroutine and the ill-named
> verify_sysid stuff could be another. I'm not certain exactly what's
> best here, but some of this code is as deeply as six levels nested,
> which is not such a terrible thing that nobody should ever do it, but
> it is bad enough that we should at least look around for a better way.
>

Okay, I added the verify_checksums() and verify_controldata()
functions to the astreamer_verify.c file. I also updated related
variables that were clashing with these function names:
verify_checksums has been renamed to verifyChecksums, and verify_sysid
has been renamed to verifyControlData.

Thanks again for the review comments. Please have a look at the
attached version.

Regards,
Amul

Attachment Content-Type Size
v3-0011-pg_verifybackup-Tests-and-document.patch application/x-patch 13.7 KB
v3-0007-Refactor-split-verify_file_checksum-function.patch application/x-patch 5.4 KB
v3-0009-pg_verifybackup-Add-backup-format-and-compression.patch application/x-patch 8.8 KB
v3-0008-Refactor-split-verify_control_file.patch application/x-patch 5.1 KB
v3-0010-pg_verifybackup-Read-tar-files-and-verify-its-con.patch application/x-patch 21.4 KB
v3-0006-Refactor-split-verify_backup_file-function.patch application/x-patch 2.8 KB
v3-0005-Refactor-move-some-part-of-pg_verifybackup.c-to-p.patch application/x-patch 8.3 KB
v3-0004-Refactor-move-show_progress-and-skip_checksums-to.patch application/x-patch 3.8 KB
v3-0003-Refactor-move-astreamer-files-to-fe_utils-to-make.patch application/x-patch 7.1 KB
v3-0002-Refactor-Add-astreamer_inject.h-and-move-related-.patch application/x-patch 3.0 KB
v3-0001-Refactor-Rename-all-bbstreamer-references-to-astr.patch application/x-patch 106.8 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Junwang Zhao 2024-07-31 13:30:56 Re: refactor the CopyOneRowTo
Previous Message Joe Conway 2024-07-31 13:14:37 Re: can we mark upper/lower/textlike functions leakproof?