Re: Errors with schema migration and logical replication — expected?

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Mike Lissner <mlissner(at)michaeljaylissner(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Errors with schema migration and logical replication — expected?
Date: 2018-12-12 15:11:08
Message-ID: 2f98e0cb-0a41-fecb-0ef0-16f288d52e50@aklaver.com
Lists: pgsql-general

On 12/12/18 12:15 AM, Mike Lissner wrote:
>
>
> On Tue, Dec 11, 2018 at 3:10 PM Adrian Klaver
> <adrian(dot)klaver(at)aklaver(dot)com> wrote:
>
>
> >     Well, I was able to fix this by briefly allowing nulls on the
> >     subscriber, letting it catch up with the publisher, setting all
> >     nulls to empty strings (a Django convention), and then disallowing
> >     nulls again. After letting it catch up, there were 118 nulls on the
> >     subscriber in this column:
>
> So recap_sequence_number is not actually a number, it is a code?
>
>
> It has sequential values, but they're not necessarily numbers.
>
>
> >
> >     I appreciate all the responses. I'm scared to say so, but I think
> >     this is a bug in logical replication. Somehow a null value appeared
> >     at the subscriber that was never in the publisher.
> >
> >     I also still have this question/suggestion from my first email:
> >
> >      > Is the process for schema migrations documented somewhere beyond
> >      > the above?
>
> Not that I know of. It might help, if possible, to detail the steps in
> the migration, and what program you used to do it. Given that it is
> Django I am assuming some combination of migrate, makemigrations and/or
> sqlmigrate.
>
>
> Pretty simple/standard, I think:
>  - Changed the code.
>  - Generated the migration using manage.py makemigrations
>  - Generated the SQL using sqlmigrate
>  - Ran the migration using manage.py migrate on the master and using
>    psql on the replica

The only thing I can think of involves this sequence on the subscriber:

ALTER TABLE "search_docketentry" ADD COLUMN "recap_sequence_number"
varchar(50) DEFAULT '' NOT NULL;
ALTER TABLE "search_docketentry" ALTER COLUMN "recap_sequence_number"
DROP DEFAULT;

and then this:

https://www.postgresql.org/docs/11/logical-replication-subscription.html

"Columns of a table are also matched by name. A different order of
columns in the target table is allowed, but the column types have to
match. The target table can have additional columns not provided by the
published table. Those will be filled with their default values."

https://www.postgresql.org/docs/10/sql-createtable.html

"If there is no default for a column, then the default is null."

So the subscriber finished the migration first, as alluded to in an
earlier post. No data for recap_sequence_number is coming from the
provider yet, and with the default dropped Postgres fills the column
with NULL until the migration on the provider finishes and actual data
for recap_sequence_number starts flowing.
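
To make that reasoning concrete, here is a minimal sketch in Python of how I understand the subscriber-side behavior described in the docs quoted above. It is a toy model, not PostgreSQL code: the Column class, apply_replicated_row(), and the "id" and "description" columns are all invented for illustration; only search_docketentry's recap_sequence_number comes from the thread.

```python
# Toy model of applying one replicated row on the subscriber.
# Everything here is hypothetical illustration, not PostgreSQL internals.
NO_DEFAULT = object()  # sentinel: the column has no DEFAULT clause


class Column:
    def __init__(self, name, not_null=False, default=NO_DEFAULT):
        self.name = name
        self.not_null = not_null
        self.default = default


def apply_replicated_row(columns, incoming):
    """Build the row the subscriber would store for one incoming tuple."""
    row = {}
    for col in columns:
        if col.name in incoming:
            value = incoming[col.name]      # column provided by the publisher
        elif col.default is NO_DEFAULT:
            value = None                    # no default, so the default is NULL
        else:
            value = col.default             # extra column filled with its default
        if value is None and col.not_null:
            raise ValueError(
                f'null value in column "{col.name}" violates not-null constraint'
            )
        row[col.name] = value
    return row


# A row from a provider that has not finished the migration yet:
# it carries no recap_sequence_number. "id"/"description" are made up.
incoming = {"id": 1, "description": "Complaint"}
base = [Column("id", not_null=True), Column("description")]

# Subscriber after ADD COLUMN ... DEFAULT '' NOT NULL followed by DROP DEFAULT.
after_drop = base + [Column("recap_sequence_number", not_null=True)]
# Subscriber that keeps DEFAULT '' on the NOT NULL column.
with_default = base + [Column("recap_sequence_number", not_null=True, default="")]
# Subscriber with NOT NULL temporarily relaxed (the workaround in the thread).
nullable = base + [Column("recap_sequence_number")]
```

In this model, applying incoming to after_drop raises a not-null error, with_default stores an empty string, and nullable stores None, which is consistent with NULLs turning up on the subscriber once nulls were briefly allowed there.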

Going forward, options I see:

1) Make sure there is a DEFAULT other than NULL for a NOT NULL column.

2) Stop replication, run the migration scripts on both provider and
subscriber until both complete, and then start replication again.
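
As a rough sketch of the ordering in option 2, the Python below only builds the list of statements and where each would run; it does not connect to anything. The subscription name my_subscription and the migration_plan() helper are hypothetical. ALTER SUBSCRIPTION ... DISABLE / ENABLE is how I would pause and resume the subscriber, and the sketch assumes nothing writes to the table on the provider between the two migrations; otherwise rows queued without the new column could still hit the same problem.

```python
def migration_plan(subscription, migration_sql):
    """Return ordered (node, statement) steps: pause, migrate both, resume."""
    steps = [("subscriber", f"ALTER SUBSCRIPTION {subscription} DISABLE;")]
    for node in ("provider", "subscriber"):
        # Run the same migration statements on each node in turn.
        steps.extend((node, stmt) for stmt in migration_sql)
    steps.append(("subscriber", f"ALTER SUBSCRIPTION {subscription} ENABLE;"))
    return steps


# The two statements quoted earlier in this message.
migration_sql = [
    'ALTER TABLE "search_docketentry" ADD COLUMN "recap_sequence_number" '
    "varchar(50) DEFAULT '' NOT NULL;",
    'ALTER TABLE "search_docketentry" ALTER COLUMN "recap_sequence_number" '
    "DROP DEFAULT;",
]

# "my_subscription" is a placeholder name, not one from the thread.
for node, stmt in migration_plan("my_subscription", migration_sql):
    print(f"{node}: {stmt}")
```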

> Mike

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
