Re: Pgoutput not capturing the generated columns

From: Shubham Khanna <khannashubham1197(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Rajendra Kumar Dangwal <dangwalrajendra888(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, euler(at)eulerto(dot)com
Subject: Re: Pgoutput not capturing the generated columns
Date: 2024-10-16 18:03:35
Message-ID: CAHv8RjKOvqs8Ouodcd+jBpgGinVffqqZxAVzzbZBrt4GkDcR7w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 10, 2024 at 10:53 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Here are some comments for TAP test patch v37-0003.
>
> I’m not in favour of the removal of such a large number of
> 'combination' and other 'misc' tests. In the commit message, please
> delete me as a "co-author" of this patch.
>
> ======
>
> 1.
> Any description or comment that still mentions "all combinations" is
> no longer valid:
>
> (e.g. in the comment message)
> Add tests for all combinations of generated column replication.
>
> (e.g. in the test file)
> # The following test cases exercise logical replication for all combinations
> # where there is a generated column on one or both sides of pub/sub:
>
> and
>
> # Furthermore, all combinations are tested using:
>
> ======
> 2.
> +# --------------------------------------------------
> +# Testcase: generated -> normal
> +# Publisher table has generated column 'b'.
> +# Subscriber table has normal column 'b'.
> +# --------------------------------------------------
> +
>
> Now that COPY for generated columns is already implemented in patch
> 0001, shouldn't this test be using 'copy_data' enabled, so it can test
> replication both for initial tablesync as well as normal replication?
>
> That was the whole point of having the "# XXX copy_data=false for now.
> This will be changed later." reminder comment in this file.
>
> ======
>
> 3.
> Previously there were some misc tests to ensure that a generated
> column which was then altered using DROP EXPRESSION would work as
> expected. The test scenario was commented like:
>
> +# =============================================================================
> +# Misc test.
> +#
> +# A "normal -> generated" replication fails, reporting an error that the
> +# subscriber side column is missing.
> +#
> +# In this test case we use DROP EXPRESSION to change the subscriber generated
> +# column into a normal column, then verify replication works ok.
> +# =============================================================================
>
> Now in patch v37 this test no longer exists. Why?
>
> ======
> 4.
> +# =============================================================================
> +# The following test cases demonstrate behavior of generated column replication
> +# when publish_generated_colums=false/true:
> +#
> +# Test: column list includes gencols, when publish_generated_columns=false
> +# Test: column list does not include gencols, when
> publish_generated_columns=false
> +#
> +# Test: column list includes gencols, when publish_generated_columns=true
> +# Test: column list does not include gencols, when
> publish_generated_columns=true
> +# Test: no column list, when publish_generated_columns=true
> +# =============================================================================
>
> These tests are currently only testing the initial tablesync
> replication. Since the COPY logic is different from the normal
> replication logic, I think it would be better to test some normal
> replication records as well, to make sure both parts work
> consistently. This comment applies to all of the following test cases.
>
> ~~~
>
> 5.
> +# Create table and publications.
> +$node_publisher->safe_psql(
> + 'postgres', qq(
> + CREATE TABLE nogen_to_gen3 (a int, b int, gen1 int GENERATED ALWAYS
> AS (a * 2) STORED, gen2 int GENERATED ALWAYS AS (a * 2) STORED);
> + CREATE TABLE nogen_to_gen4 (c int, d int, gen1 int GENERATED ALWAYS
> AS (c * 2) STORED, gen2 int GENERATED ALWAYS AS (c * 2) STORED);
> + INSERT INTO nogen_to_gen3 VALUES (1, 1);
> + INSERT INTO nogen_to_gen4 VALUES (1, 1);
> + CREATE PUBLICATION pub1 FOR table nogen_to_gen3, nogen_to_gen4(gen1)
> WITH (publish_generated_columns=true);
> +));
> +
>
> 5a.
> The code should do only what the comments say it does. So, the INSERTS
> should be done separately after the CREATE PUBLICATION, but before the
> CREATE SUBSCRIPTION. A similar change should be made for all of these
> test cases.
>
> # Insert some initial data
> INSERT INTO nogen_to_gen3 VALUES (1, 1);
> INSERT INTO nogen_to_gen4 VALUES (1, 1);
>
> ~
>
> 5b.
> The tables are badly named. Why are they 'nogen_to_gen', when the
> publisher side has generated cols and the subscriber side does not?
> This problem seems repeated in multiple subsequent test cases.
>
> ~
>
> 6.
> +$result = $node_subscriber->safe_psql('postgres',
> + "SELECT * FROM gen_to_nogen ORDER BY a");
> +is($result, qq(1|1||2),
> + 'gen_to_nogen initial sync, when publish_generated_columns=false');
> +
> +$result = $node_subscriber->safe_psql('postgres',
> + "SELECT * FROM gen_to_nogen2 ORDER BY c");
> +is($result, qq(1|1||),
> + 'gen_to_nogen2 initial sync, when publish_generated_columns=false');
>
> IMO all the "result" queries like these ones ought to have to have a
> comment which explains the reason for the expected results. This
> review comment applies to multiple places. Please add comments to all
> of them.
>
> ~~~
>
> 7.
> +# --------------------------------------------------
> +# Testcase: Publisher replicates the column list data excluding generated
> +# columns even though publish_generated_columns option is false.
> +# --------------------------------------------------
> +
>
> 7a.
> This is the 2nd test case, but AFAICT it would be far easier to test
> this scenario just by making another table (with an appropriate column
> list) for the 1st test case.
>
> ~
>
> 7b.
> BTW, I don't understand this test at all. I thought according to the
> comment that it intended to use a publication column list with only
> normal columns in it. But that is not what the publication looks like
> here:
> + CREATE PUBLICATION pub1 FOR table nogen_to_gen, nogen_to_gen2(gen1)
> WITH (publish_generated_columns=false);
>
> Indeed, the way it is currently written I didn't see what this test is
> doing that is any different from the prior test (???)
>
> ~~~
>
> 8.
> +# --------------------------------------------------
> +# Testcase: Although publish_generated_columns is true, publisher publishes
> +# only the data of the columns specified in column list, skipping other
> +# generated/non-generated columns.
> +# --------------------------------------------------
>
> versus
>
> +# --------------------------------------------------
> +# Testcase: Publisher publishes only the data of the columns specified in
> +# column list skipping other generated/non-generated columns.
> +# --------------------------------------------------
>
> Again, I did not understand how these test cases differ from each
> other. Surely, those can be combined easily enough just by adding
> another table with a different kind of column list.
>
> ~~~
>
> 9.
> +# --------------------------------------------------
> +# Testcase: Publisher replicates all columns if publish_generated_columns is
> +# enabled and there is no column list
> +# --------------------------------------------------
> +
>
> Here is yet another test case that AFAICT can just be combined with
> other test cases that were using publish_generated_columns=true. It
> seems all you need is one extra table with no column list. You don't
> need all the extra create/drop pub/sub overheads to test this.
>
> ======

I have fixed all the comments and posted the v39 patches for them.
Please refer to the updated v39 Patches here in [1]. See [1] for the
changes added.

[1] https://www.postgresql.org/message-id/CAHv8RjLjb%2B98i5ZQUphivxdOZ3hSGLfq2SiWQetUvk8zGyAQwQ%40mail.gmail.com

Thanks and Regards,
Shubham Khanna.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Shayon Mukherjee 2024-10-16 18:15:51 Re: Proposal to Enable/Disable Index using ALTER INDEX (with patch)
Previous Message Shubham Khanna 2024-10-16 18:02:51 Re: Pgoutput not capturing the generated columns