Re: Pgoutput not capturing the generated columns

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Shubham Khanna <khannashubham1197(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Rajendra Kumar Dangwal <dangwalrajendra888(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, euler(at)eulerto(dot)com
Subject: Re: Pgoutput not capturing the generated columns
Date: 2024-10-24 06:45:46
Message-ID: CALDaNm1GCmzzGV6oGOB9g+LOA1j-8N_EFR3YNN1KaOr+oKdqJA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 23 Oct 2024 at 11:51, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Oct 22, 2024 at 9:42 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Tue, Oct 22, 2024 at 3:50 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> >
> > > Considering this, I feel if find this behavior buggy then we should
> > > fix this separately rather than part of this patch. What do you think?
> >
> > Agreed. It's better to fix it separately.
> >
>
> Thanks. One more thing that I didn't like about the patch is that it
> used column_list to address the "publish_generated_columns = false"
> case such that we build column_list without generated columns for the
> same. The first problem is that it will add overhead to always probe
> column_list during proto.c calls (for example during
> logicalrep_write_attrs()), then it makes the column_list code complex
> especially the handling in pgoutput_column_list_init(), and finally
> this appears to be a misuse of column_list.

I simplified the process into three steps: a) Iterate through the
column list of publications and raise an error if there are any
discrepancies in the column list. b) Examine the non-column list of
publications to identify any conflicting options for
publish_generated_columns, and set this option accordingly. c)
Finally, verify that the columns in the column list align with the
table's columns based on the publish_generated_columns option.

> So, I suggest remembering this information in RelationSyncEntry and
> then using it at the required places. We discussed above that
> contradictory values of "publish_generated_columns" across
> publications for the same relations are not accepted, so we can detect
> that during get_rel_sync_entry() and give an ERROR for the same.

Resolved the issue by adding a new variable, pubgencols, to determine
whether the generated columns need to be published. This variable will
later be utilized in logicalrep_write_tuple and logicalrep_write_attrs
to replicate the generated columns to the subscriber if required.

> Additional comment on the 0003 patch
> +# =============================================================================
> +# Misc test.
> +#
> +# A "normal -> generated" replication.
> +#
> +# In this test case we use DROP EXPRESSION to change the subscriber generated
> +# column into a normal column, then verify replication works ok.
> +# =============================================================================
>
> In patch 0003, why do we have the above test? This doesn't seem to be
> directly related to this patch.

Removed this.

The attached v41 version patch has the changes for the same.

Regards,
Vignesh

Attachment Content-Type Size
v41-0003-Tap-tests-for-generated-columns.patch text/x-patch 11.1 KB
v41-0002-DOCS-Generated-Column-Replication.patch text/x-patch 13.3 KB
v41-0001-Enable-support-for-publish_generated_columns-opt.patch text/x-patch 111.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-10-24 06:47:46 Re: Pgoutput not capturing the generated columns
Previous Message Peter Smith 2024-10-24 06:44:44 Re: Refactor to use common function 'get_publications_str'.