Re: Pgoutput not capturing the generated columns

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Shubham Khanna <khannashubham1197(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Rajendra Kumar Dangwal <dangwalrajendra888(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, euler(at)eulerto(dot)com
Subject: Re: Pgoutput not capturing the generated columns
Date: 2024-10-24 06:47:46
Message-ID: CAA4eK1LC3NoU0W8LEAGK3ZeQVkt29dqRkVmBhbfA28x+GZNfsA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 23, 2024 at 11:51 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> Thanks. One more thing that I didn't like about the patch is that it
> used column_list to address the "publish_generated_columns = false"
> case such that we build column_list without generated columns for the
> same. The first problem is that it will add overhead to always probe
> column_list during proto.c calls (for example during
> logicalrep_write_attrs()), then it makes the column_list code complex
> especially the handling in pgoutput_column_list_init(), and finally
> this appears to be a misuse of column_list.
>
> So, I suggest remembering this information in RelationSyncEntry and
> then using it at the required places. We discussed above that
> contradictory values of "publish_generated_columns" across
> publications for the same relations are not accepted, so we can detect
> that during get_rel_sync_entry() and give an ERROR for the same.
>

The changes in tablesync look complicated and I am not sure whether it
handles the conflicting publish_generated_columns option correctly. I
have few thoughts for the same.
* The handling of contradictory options in multiple publications needs
to be the same as for column lists. I think it is handled (a) during
subscription creation, (b) during copy in fetch_remote_table_info(),
and (c) during replication. See Peter's email
(https://www.postgresql.org/message-id/CAHut%2BPs985rc95cB2x5yMF56p6Lf192AmCJOpAtK_%2BC5YGUF2A%40mail.gmail.com)
to understand why this new option has to be handled in the same way as
the column list.

* While fetching column list via pg_get_publication_tables(), we
should detect contradictory publish_generated_columns options similar
to column lists, and then after we get publish_generated_columns as
return value, we can even use that while fetching attribute
information.

A few additional comments:
1.
- /* Regular table with no row filter */
- if (lrel.relkind == RELKIND_RELATION && qual == NIL)
+ /*
+ * Check if the remote table has any generated columns that should be
+ * copied.
+ */
+ for (int i = 0; i < relmapentry->remoterel.natts; i++)
+ {
+ if (lrel.attremotegen[i])
+ {
+ gencol_copy_needed = true;
+ break;
+ }
+ }

Can't we get this information from fetch_remote_table_info() instead
of traversing the entire column list again?

2.
@@ -1015,7 +1110,6 @@ fetch_remote_table_info(char *nspname, char *relname,
ExecDropSingleTupleTableSlot(slot);

lrel->natts = natt;
-
walrcv_clear_result(res);

Spurious line removal.

3. Why do we have to specifically exclude generated columns of a
subscriber-side table in make_copy_attnamelist()? Can't we rely on
fetch_remote_table_info() and logicalrep_rel_open() that the final
remote attrlist will contain the generated column only if the
subscriber doesn't have a generated column otherwise it would have
given an error in logicalrep_rel_open()?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tender Wang 2024-10-24 07:01:07 Re: Wrong result when enable_partitionwise_join is on if collation of PartitionKey and Column is different.
Previous Message vignesh C 2024-10-24 06:45:46 Re: Pgoutput not capturing the generated columns