Re: Ambiguous description on new columns

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: kdg(dot)dev(at)gmail(dot)com, pgsql-docs(at)lists(dot)postgresql(dot)org
Subject: Re: Ambiguous description on new columns
Date: 2024-05-29 09:55:52
Message-ID: CALDaNm0sA0M2_ic_dN8B9PiZhZE8Mbvo=oCdw_-=+=_0w43T-A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs pgsql-hackers

On Wed, 22 May 2024 at 08:18, Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> On Tue, May 21, 2024 at 8:40 PM PG Doc comments form
> <noreply(at)postgresql(dot)org> wrote:
> >
> > The following documentation comment has been logged on the website:
> >
> > Page: https://www.postgresql.org/docs/16/logical-replication-col-lists.html
> > Description:
> >
> > The documentation on this page mentions:
> >
> > "If no column list is specified, any columns added later are automatically
> > replicated."
> >
> > It feels ambiguous what this could mean. Does it mean:
> >
> > 1/ That if you alter the table on the publisher and add a new column, it
> > will be replicated
> >
> > 2/ If you add a column list later and add a column to it, it will be
> > replicated
> >
> > In both cases, does the subscriber automatically create this column if it
> > wasn't there before?
>
> No, the subscriber will not automatically create the column. That is
> already clearly said at the top of the same page you linked "The table
> on the subscriber side must have at least all the columns that are
> published."
>
> All that "If no column list..." paragraph was trying to say is:
>
> CREATE PUBLICATION pub FOR TABLE T;
>
> is not quite the same as:
>
> CREATE PUBLICATION pub FOR TABLE T(a,b,c);
>
> The difference is, in the 1st case if you then ALTER the TABLE T to
> have a new column 'd' then that will automatically start replicating
> the 'd' data without having to do anything to either the PUBLICATION
> or the SUBSCRIPTION. Of course, if TABLE T at the subscriber side does
> not have a column 'd' then you'll get an error because your subscriber
> table needs to have *at least* all the replicated columns. (I
> demonstrate this error below)
>
> Whereas in the 2nd case, even though you ALTER'ed the TABLE T to have
> a new column 'd' then that won't be replicated because 'd' was not
> named in the PUBLICATION's column list.
>
> ~~~~
>
> Here's an example where you can see this in action
>
> Here is an example of the 1st case -- it shows 'd' is automatically
> replicated and also shows the subscriber-side error caused by the
> missing column:
>
> test_pub=# CREATE TABLE T(a int,b int, c int);
> test_pub=# CREATE PUBLICATION pub FOR TABLE T;
>
> test_sub=# CREATE TABLE T(a int,b int, c int);
> test_sub=# CREATE SUBSCRIPTION sub CONNECTION 'dbname=test_pub' PUBLICATION pub;
>
> See the replication happening
> test_pub=# INSERT INTO T VALUES (1,2,3);
> test_sub=# SELECT * FROM t;
> a | b | c
> ---+---+---
> 1 | 2 | 3
> (1 row)
>
> Now alter the publisher table T and insert some new data
> test_pub=# ALTER TABLE T ADD COLUMN d int;
> test_pub=# INSERT INTO T VALUES (5,6,7,8);
>
> This will cause subscription errors like:
> 2024-05-22 11:53:19.098 AEST [16226] ERROR: logical replication
> target relation "public.t" is missing replicated column: "d"
>
> ~~~~
>
> I think the following small change will remove any ambiguity:
>
> BEFORE
> If no column list is specified, any columns added later are
> automatically replicated.
>
> SUGGESTION
> If no column list is specified, any columns added to the table later
> are automatically replicated.
>
> ~~
>
> I attached a small patch to make the above change.
>
> Thoughts?

A minor suggestion, the rest looks good:
It would enhance clarity to include a line break following "If no
column list is specified, any columns added to the table later are":
- If no column list is specified, any columns added later are automatically
+ If no column list is specified, any columns added to the table
later are automatically
replicated. This means that having a column list which names all columns

Regards,
Vignesh

In response to

Browse pgsql-docs by date

  From Date Subject
Next Message Tom Lane 2024-05-29 16:32:54 Re: 17beta1 source download does not include docs & manpages
Previous Message Marcel Hofstetter 2024-05-29 06:53:29 Re: 17beta1 source download does not include docs & manpages

Browse pgsql-hackers by date

  From Date Subject
Next Message ISHAN CHHANGANI . 2024-05-29 10:45:09 Timeout gets unset on a syntax error.
Previous Message vignesh C 2024-05-29 09:39:54 Re: Improving the latch handling between logical replication launcher and worker processes.