Quick Links

Re: Add -k/--link option to pg_combinebackup

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Israel Barth Rubio <barthisrael(at)gmail(dot)com>
Cc:	vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Add -k/--link option to pg_combinebackup
Date:	2025-03-04 17:28:30
Message-ID:	CA+TgmoawCPihXHb3ik2EsJ67xiwsqC-FWwonhLmCXkNOQ=m=OA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Mar 4, 2025 at 7:07 AM Israel Barth Rubio <barthisrael(at)gmail(dot)com> wrote:
> About the last one, related to the copy method, my first thought was to do
> something like you did (in the sense of using == instead of !=). However, I
> was concerned that I would change the previous behavior. But I do prefer
> how it stands in your suggestion, thanks!

I don't think it does, because I think those options are anyway
blocked on Windows. See the comment block that says /* Check that the
platform supports the requested copy method. */

> I'm attaching the v5 of the patch now.

So, this still fails for me, in a manner quite similar to before. The
first call to check_data_file() is for
/Users/robert.haas/pgsql-meson/testrun/pg_combinebackup/010_links/data/t_010_links_restore_data/pgdata/base/5/16384
but @data_file_segments ends up zero-length. Therefore, $last_segment
is undef when we access it at the bottom of check_data_file(). I think
the problem is here:

my $basename = (split /\./, $full_path)[0];

This code assumes that no path name component other than the filename
at the end contains a period, but in my case that's not true, because
my home directory on this particular computer is /Users/robert.haas.

But I'm wondering why you are even using readdir() here in the first
place. You could just test -f "16384"; if it exists test -f "16384.1";
if that exists test -f "16384.2" and keep going until you reach a file
that does not exist. Not only would this avoid reading the entire
directory contents for every call to this function, but it would also
generate the list of files in sorted order, which would let you throw
away natural_sort().

Overall, I feel like this could just be significantly simpler. For
instance, I don't quite see the need to have both test_1 and test_2.
If we just have one table and all the segments but the last have 2
hard links while the last have 1, isn't that enough? What test
coverage do we get by adding the second relation? All of the segments
of test_1 behave just like the non-final segments of test_2, so why
test both? If you go down to 1 table, you can simplify a bunch of
things here.

Why does the insert add 10 new rows instead of just 1? Doesn't that
introduce a significant risk that with some choice of segment size
those inserts will be spread across two segments rather than one,
leading to the test failing?

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Re: Add -k/--link option to pg_combinebackup at 2025-03-04 12:06:50 from Israel Barth Rubio

Responses

Re: Add -k/--link option to pg_combinebackup at 2025-03-05 14:47:33 from Israel Barth Rubio

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Nathan Bossart	2025-03-04 17:36:09	Re: Improve CRC32C performance on SSE4.2
Previous Message	Peter Geoghegan	2025-03-04 17:25:10	Re: Next commitfest app release is planned for March 18th