Re: Test to dump and restore objects left behind by regression

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, Daniel Gustafsson <daniel(at)yesql(dot)se>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Test to dump and restore objects left behind by regression
Date: 2024-06-05 11:39:58
Message-ID: CAExHW5snZ+y47BwH89EhhDc7ZtWN4VD0iWrODbvA6q2XQitqvA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jun 4, 2024 at 4:28 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:

> On Fri, Apr 26, 2024 at 06:38:22PM +0530, Ashutosh Bapat wrote:
> > Some points for discussion:
> > 1. The test still hardcodes the diffs between two dumps. Haven't found a
> > better way to do it. I did consider removing the problematic objects from
> > the regression database but thought against it since we would lose some
> > coverage.
> >
> > 2. The new code tests dump and restore of just the regression database
> and
> > does not use pg_dumpall like pg_upgrade. Should it instead perform
> > pg_dumpall? I decided against it since a. we are interested in dumping
> and
> > restoring objects left behind by regression, b. I didn't find a way to
> > provide the format option to pg_dumpall. The test could be enhanced to
> use
> > different dump formats.
> >
> > I have added it to the next commitfest.
> > https://commitfest.postgresql.org/48/4956/
>
> Ashutosh and I have discussed this patch a bit last week. Here is a
> short summary of my input, after I understood what is going on.
>
> + # We could avoid this by dumping the database loaded from original
> dump.
> + # But that would change the state of the objects as left behind by
> the
> + # regression.
> + my $expected_diff = " --
> + CREATE TABLE public.gtestxx_4 (
> +- b integer,
> +- a integer NOT NULL
> ++ a integer NOT NULL,
> ++ b integer
> + )
> [...]
> + my ($stdout, $stderr) =
> + run_command([ 'diff', '-u', $dump4_file, $dump5_file]);
> + # Clear file names, line numbers from the diffs; those are not
> going to
> + # remain the same always. Also clear empty lines and normalize new
> line
> + # characters across platforms.
> + $stdout =~ s/^\(at)\@.*$//mg;
> + $stdout =~ s/^.*$dump4_file.*$//mg;
> + $stdout =~ s/^.*$dump5_file.*$//mg;
> + $stdout =~ s/^\s*\n//mg;
> + $stdout =~ s/\r\n/\n/g;
> + $expected_diff =~ s/\r\n/\n/g;
> + is($stdout, $expected_diff, 'old and new dumps match after dump
> and restore');
> +}
>
> I am not a fan of what this patch does, adding the knowledge related
> to the dump filtering within 002_pg_upgrade.pl. Please do not take
> me wrong, I am not against the idea of adding that within this
> pg_upgrade test to save from one full cycle of `make check` to check
> the consistency of the dump. My issue is that this logic should be
> externalized, and it should be in fewer lines of code.

> For the externalization part, Ashutosh and I considered a few ideas,
> but one that we found tempting is to create a small .pm, say named
> AdjustDump.pm. This would share some rules with the existing
> AdjustUpgrade.pm, which would be fine IMO even if there is a small
> overlap, documenting the dependency between each module. That makes
> the integration with the buildfarm much simpler by not creating more
> dependencies with the modules shared between core and the buildfarm
> code. For the "shorter" part, one idea that I had is to apply to the
> dump a regexp that wipes out the column definitions within the
> parenthesis, keeping around the CREATE TABLE and any other attributes
> not impacted by the reordering. All that should be documented in the
> module, of course.
>

Thanks for the suggestion. I didn't understand the dependency with the
buildfarm module. Will the new module be used in buildfarm separately? I
will work on this soon. Thanks for changing CF entry to WoA.

>
> Another thing would be to improve the backend so as we are able to
> a better support for physical column ordering, which would, I assume
> (and correct me if I'm wrong!), prevent the reordering of the
> attributes like in this inheritance case. But that would not address
> the case of dumps taken from older versions with a new version of
> pg_dump, which is something that may be interesting to have more tests
> for in the long-term. Overall a module sounds like a better solution.
>

Changing the physical order of column of a child table based on the
inherited table seems intentional as per MergeAttributes(). That logic
looks sane by itself. In binary mode pg_dump works very hard to retain the
column order by issuing UPDATE commands against catalog tables. I don't
think mimicking that behaviour is the right choice for non-binary dump. I
agree with your conclusion that we fix it in by fixing the diffs. The code
to do that will be part of a separate module.

--
Best Wishes,
Ashutosh Bapat

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jelte Fennema-Nio 2024-06-05 11:52:59 libpq: Trace StartupMessage/SSLRequest/GSSENCRequest correctly
Previous Message Ashutosh Bapat 2024-06-05 11:26:10 Re: meson and check-tests