Re: Test to dump and restore objects left behind by regression

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, vignesh C <vignesh21(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Test to dump and restore objects left behind by regression
Date: 2025-03-31 11:37:15
Message-ID: CAExHW5v6QMe7R7M1SVDLLtUO+tueAxxwcqCunzUrJBMCUmcxww@mail.gmail.com
Lists: pgsql-hackers

On Fri, Mar 28, 2025 at 11:43 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> On 2025-Mar-28, Tom Lane wrote:
>
> > I think instead of going this direction, we really need to create a
> > separately-purposed script that simply creates "one of everything"
> > without doing anything else (except maybe loading a little data).
> > I believe it'd be a lot easier to remember to add to that when
> > inventing new SQL than to remember to leave something behind from the
> > core regression tests. This would also be far faster to run than any
> > approach that involves picking a random subset of the core test
> > scripts.

It's easier to remember to do (or not do) something in the same file
than in some other file. I find it hard to believe that introducing
another set of SQL files, somewhere far away from regress, would make
this problem easier.

The number of states in which objects can be left behind by the
scripts in regress/sql is very large, and maintaining a 1:1
correspondence with some other set of scripts is impossible unless it
is automated.

> FWIW this sounds closely related to what I tried to do with
> src/test/modules/test_ddl_deparse; it's currently incomplete, but maybe
> we can use that as a starting point.

create_table.sql in test_ddl_deparse has only one statement creating
an inheritance table, whereas regress creates dozens of different
states of parent/child tables. It would require a lot of work to
bridge the gap between test_ddl_deparse and regress, and more work to
keep them in sync.

I might be missing something in your ideas.

IMO, whatever we do should rely on a single set of files. One
possible way would be to break each of the existing files into three
files, containing the DDL, DML and queries from that file
respectively, and to create three schedules (DDL, DML and queries)
containing the respective files. These schedules would be run as
required: a standard regression run would run all three schedules one
after another, but 002_pg_upgrade would run the DDL and DML schedules
on the source database and the queries on the target, thus checking
the sanity of dump/restore or pg_upgrade beyond just the dump
comparison. 027_stream_regress might similarly run DDL and DML on the
source server and queries on the target.
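As a rough sketch of what that could look like (schedule and test
file names below are made up), pg_regress already accepts multiple
--schedule options, so the TAP tests could pick which subsets to run:

# ddl_schedule
test: create_table_ddl inherit_ddl alter_table_ddl
# dml_schedule
test: insert_dml copy_dml update_delete_dml
# query_schedule
test: select_queries join_queries aggregate_queries

002_pg_upgrade would then run something like
  pg_regress --schedule=ddl_schedule --schedule=dml_schedule
against the source cluster, and
  pg_regress --schedule=query_schedule
against the restored/upgraded one.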

But that too is easier said than done, because:
1. Our tests mix all three kinds of statements and also rely on the
order in which they are run (see the sketch below). It would require
significant effort to carefully separate the statements.
2. With a new set of files, backpatching would become hard.
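For example (a made-up but typical pattern), the expected output of a
query often depends on the DDL and DML executed just before and after
it in the same file:

CREATE TABLE t (a int);            -- DDL
INSERT INTO t VALUES (1);          -- DML
SELECT * FROM t;                   -- query: expects one column
ALTER TABLE t ADD COLUMN b int;    -- DDL again, after the query
SELECT * FROM t;                   -- same query, now expects two columns

Splitting the two SELECTs out into a queries-only file would change
their expected output, because both would then run after all the DDL.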

--
Best Wishes,
Ashutosh Bapat
