Re: Add -k/--link option to pg_combinebackup

From: Israel Barth Rubio <barthisrael(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add -k/--link option to pg_combinebackup
Date: 2025-03-05 14:47:33
Message-ID: CAO_rXXCP7sVoDqTjo3O15nqPs32JvrBNPiQ_QuJOuADUS64t9Q@mail.gmail.com
Lists: pgsql-hackers

> I don't think it does, because I think those options are anyway
> blocked on Windows. See the comment block that says /* Check that the
> platform supports the requested copy method. */

Yeah, you are right. I could swear I had run `pg_combinebackup --clone`
on a Windows VM and, instead of erroring out, it fell back to CopyFile.
But that's not the case: it errors out (as expected).

> So, this still fails for me, in a manner quite similar to before. The
> first call to check_data_file() is for
>
> /Users/robert.haas/pgsql-meson/testrun/pg_combinebackup/010_links/data/t_010_links_restore_data/pgdata/base/5/16384
> but @data_file_segments ends up zero-length. Therefore, $last_segment
> is undef when we access it at the bottom of check_data_file(). I think
> the problem is here:
>
> my $basename = (split /\./, $full_path)[0];
>
> This code assumes that no path name component other than the filename
> at the end contains a period, but in my case that's not true, because
> my home directory on this particular computer is /Users/robert.haas.

Right, well spotted. The test code made a bad assumption: the split
should be applied to the file name only, not to the full path, which is
why it broke in your environment.
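
For the record, the split should have been done on the base name,
i.e. something like this sketch:

    use File::Basename;

    # Operate on the file name only, so a period elsewhere in the
    # path (e.g. /Users/robert.haas) cannot interfere.
    my $file_name = basename($full_path);
    my ($base) = split /\./, $file_name;

That said, v6 sidesteps the problem entirely, as described below.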

> But I'm wondering why you are even using readdir() here in the first
> place. You could just test -f "16384"; if it exists test -f "16384.1";
> if that exists test -f "16384.2" and keep going until you reach a file
> that does not exist. Not only would this avoid reading the entire
> directory contents for every call to this function, but it would also
> generate the list of files in sorted order, which would let you throw
> away natural_sort().

That makes total sense. In the v6 version I threw out natural_sort()
and switched to the approach you described, which is much more
efficient and less error-prone.
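
The loop is roughly along these lines (a simplified sketch; variable
names may differ from the actual patch):

    # Probe 16384, 16384.1, 16384.2, ... until a segment is missing.
    # This also yields @data_file_segments already in sorted order.
    my @data_file_segments;
    my $segment = $data_file;
    my $i = 1;
    while (-f $segment)
    {
        push @data_file_segments, $segment;
        $segment = "$data_file." . $i++;
    }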

> Overall, I feel like this could just be significantly simpler. For
> instance, I don't quite see the need to have both test_1 and test_2.
> If we just have one table and all the segments but the last have 2
> hard links while the last have 1, isn't that enough? What test
> coverage do we get by adding the second relation? All of the segments
> of test_1 behave just like the non-final segments of test_2, so why
> test both? If you go down to 1 table, you can simplify a bunch of
> things here.

I think that isn't enough, because it makes assumptions about the
environment -- in this case, the Linux autoconf environment from Cirrus
CI.

The test creates a couple of tables of ~264KB. With a very small
segment size, like the one used in the autoconf job of Cirrus CI, that
is enough to produce more than a single file, so we can verify segments
with both 2 hard links and 1 hard link.

However, other jobs, like Windows meson, use the default segment size
of 1GB, in which case a single table would give us a single segment
with 1 hard link.

With that in mind, and considering that local builds from developers
might also use the default segment size, having a couple of tables with
different purposes -- test_1 for the "unmodified table" scenario and
test_2 for the "modified table" scenario -- looks more appropriate to
me.

For now I'm keeping the patch as it was in that sense, i.e. with a
check_data_file function that is called for both test_1 and test_2, and
I added a comment about that in the test file. I can send a new patch
version if having two test tables still sounds like overkill.
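
To make the two scenarios concrete, the shape of the test is roughly
this (a sketch with illustrative DDL, not the literal test code):

    # test_1: "unmodified table" -- all of its segments should end up
    # with 2 hard links after pg_combinebackup --link.
    # test_2: "modified table" -- its last segment is rewritten, so it
    # ends up with 1 hard link, while earlier segments keep 2.
    $node->safe_psql('postgres',
        'CREATE TABLE test_1 AS SELECT generate_series(1, 7000) AS i');
    $node->safe_psql('postgres',
        'CREATE TABLE test_2 AS SELECT generate_series(1, 7000) AS i');
    # take full backup; modify only test_2; take incremental backup;
    # run pg_combinebackup --link; check_data_file() on both tables.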

> Why does the insert add 10 new rows instead of just 1? Doesn't that
> introduce a significant risk that with some choice of segment size
> those inserts will be spread across two segments rather than one,
> leading to the test failing?

That was a leftover from some manual attempts on my side before
sending the patch, sorry! I started with a simple INSERT, then tried an
INSERT with a SELECT over a range, and forgot to revert it before
sending the patch. The v6 version of the patch contains the simple
INSERT, which adds only one tuple to the test_2 table and is safer, as
you pointed out.
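
In other words, the modification step boils down to something like
this (the value is illustrative):

    # A single tuple is enough to dirty test_2's last segment, with no
    # risk of the change spilling over into a new segment.
    $node->safe_psql('postgres', 'INSERT INTO test_2 VALUES (7001)');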

Best regards,
Israel.

Attachment Content-Type Size
v6-0001-pg_combinebackup-add-support-for-hard-links.patch application/octet-stream 14.9 KB
