Re: Pgoutput not capturing the generated columns

From: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: Shubham Khanna <khannashubham1197(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Rajendra Kumar Dangwal <dangwalrajendra888(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "euler(at)eulerto(dot)com" <euler(at)eulerto(dot)com>
Subject: Re: Pgoutput not capturing the generated columns
Date: 2024-07-04 10:31:31
Message-ID: CANhcyEWZH08g1gnxNxd-QeHESYniTmUdCULivS84qwKuK5zqTg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 25 Jun 2024 at 18:49, Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> Dear Shlok,
>
> Thanks for updating patches! Below are my comments, maybe only for 0002.
>
> 01. General
>
> IIUC, we are not discussed why ALTER SUBSCRIPTION ... SET include_generated_columns
> is prohibit. Previously, it seems okay because there are exclusive options. But now,
> such restrictions are gone. Do you have a reason in your mind? It is just not considered
> yet?
We donot support ALTER SUBSCRIPTION to alter
'include_generated_columns'. Suppose initially the user has a logical
replication setup. Publisher has
table t1 with columns (c1 int, c2 int generated always as (c1*2)) and
subscriber has table t1 with columns (c1 int, c2 int). And initially
'incude_generated_column' is true.
Now if we 'ALTER SUBSCRIPTION' to set 'include_generated_columns' as
false. Initial rows will have data for c2 on the subscriber table, but
will not have value after alter. This may be an inconsistent
behaviour.

> 02. General
>
> According to the doc, we allow to alter a column to non-generated one, by ALTER
> TABLE ... ALTER COLUMN ... DROP EXPRESSION command. Not sure, what should be
> when the command is executed on the subscriber while copying the data? Should
> we continue the copy or restart? How do you think?
COPY of data will happen in a single transaction, so if we execute
'ALTER TABLE ... ALTER COLUMN ... DROP EXPRESSION' command, It will
take place after the whole COPY command will finish. So I think it
will not create any issue.

> 03. Tes tcode
>
> IIUC, REFRESH PUBLICATION can also lead the table synchronization. Can you add
> a test for that?
Added

> 04. Test code (maybe for 0001)
>
> Please test the combination with TABLE ... ALTER COLUMN ... DROP EXPRESSION command.
Added

> 05. logicalrep_rel_open
>
> ```
> + /*
> + * In case 'include_generated_columns' is 'false', we should skip the
> + * check of missing attrs for generated columns.
> + * In case 'include_generated_columns' is 'true', we should check if
> + * corresponding column for the generated column in publication column
> + * list is present in the subscription table.
> + */
> + if (!MySubscription->includegencols && attr->attgenerated)
> + {
> + entry->attrmap->attnums[i] = -1;
> + continue;
> + }
> ```
>
> This comment is not very clear to me, because here we do not skip anything.
> Can you clarify the reason why attnums[i] is set to -1 and how will it be used?
This part of the code is removed to address some comments.

> 06. make_copy_attnamelist
>
> ```
> + gencollist = palloc0(MaxTupleAttributeNumber * sizeof(bool));
> ```
>
> I think this array is too large. Can we reduce a size to (desc->natts * sizeof(bool))?
> Also, the free'ing should be done.
I have changed the name 'gencollist' to 'localgenlist' to make the
name more consistent. Also
size should be (rel->remoterel.natts * sizeof(bool)) as I am storing
if a column is generated like 'localgenlist[attnum] = true;'
where 'attnum' is corresponding attribute number on publisher side.

> 07. make_copy_attnamelist
>
> ```
> + /* Loop to handle subscription table generated columns. */
> + for (int i = 0; i < desc->natts; i++)
> ```
>
> IIUC, the loop is needed to find generated columns on the subscriber side, right?
> Can you clarify as comment?
Fixed

> 08. copy_table
>
> ```
> + /*
> + * Regular table with no row filter and 'include_generated_columns'
> + * specified as 'false' during creation of subscription.
> + */
> ```
>
> I think this comment is not correct. After patching, all tablesync command becomes
> like COPY (SELECT ...) if include_genereted_columns is set to true. Is it right?
> Can we restrict only when the table has generated ones?
Fixed

Please refer to v14 patch for the changes [1].

[1]: https://www.postgresql.org/message-id/CANhcyEW95M_usF1OJDudeejs0L0%2BYOEb%3DdXmt_4Hs-70%3DCXa-g%40mail.gmail.com

Thanks and Regards,
Shlok Kyal

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2024-07-04 10:32:17 Re: Problem while installing PostgreSQL using make
Previous Message Shlok Kyal 2024-07-04 10:28:35 Re: Pgoutput not capturing the generated columns