Broken logical replication

From: Игорь Выскорко <vyskorko(dot)igor(at)yandex(dot)ru>
To: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Broken logical replication
Date: 2022-01-24 07:48:56
Message-ID: 843241643010478@mail.yandex.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

<div><div><div>Giving 2 postgres servers:</div><div>1. Master -  PostgreSQL 10.16</div><div>2. Slave - PostgreSQL 13.5</div><div> </div><div>Logical replication was configured and worked fine between them. </div><div>At some point software (russian accounting soft: 1C) which is using master DB, decided to do some db-related tasks. I know only about full reindexing - all indexes are built again - probably there were more DDL...</div><div>Ok, when it's finished I found that replication was broken:</div><div>1. Publication object exists but all publication tables were dropped. I mean not tables itself but: </div><div>select * from pg_publication_tables gave 0 rows. Ok, I've added them again - didn't help</div><div>2. In log I've found repeated bunch of records:</div><div>янв 20 23:33:29 testpg postgres[6291]: [9-1] 2022-01-20 16:33:29.087 UTC [6291] ERROR:  replication slot "subscr" is active for PID 3984</div><div>янв 20 23:33:34 testpg postgres[6295]: [7-1] 2022-01-20 16:33:34.101 UTC [6295] LOG:  connection received: host=192.168.7.225 port=43428</div><div>янв 20 23:33:34 testpg postgres[6295]: [8-1] 2022-01-20 16:33:34.102 UTC [6295] LOG:  replication connection authorized: user=rep_user</div><div>янв 20 23:33:34 testpg postgres[6295]: [9-1] 2022-01-20 16:33:34.104 UTC [6295] ERROR:  replication slot "subscr" is active for PID 3984</div><div>янв 20 23:33:39 testpg postgres[6298]: [7-1] 2022-01-20 16:33:39.117 UTC [6298] LOG:  connection received: host=192.168.7.225 port=43470</div><div>янв 20 23:33:39 testpg postgres[6298]: [8-1] 2022-01-20 16:33:39.118 UTC [6298] LOG:  replication connection authorized: user=rep_user</div><div>янв 20 23:33:39 testpg postgres[6298]: [9-1] 2022-01-20 16:33:39.537 UTC [6298] LOG:  starting logical decoding for slot "subscr"</div><div>янв 20 23:33:39 testpg postgres[6298]: [9-2] 2022-01-20 16:33:39.537 UTC [6298] DETAIL:  streaming transactions committing after 1E0/3449C020, reading WAL from 1D9/EEAD19E8</div><div>янв 20 23:33:39 testpg postgres[6298]: [10-1] 2022-01-20 16:33:39.538 UTC [6298] LOG:  logical decoding found consistent point at 1D9/EEAD19E8</div><div>янв 20 23:33:39 testpg postgres[6298]: [10-2] 2022-01-20 16:33:39.538 UTC [6298] DETAIL:  There are no running transactions.</div><div> </div><div>3. Replication status:</div><div>db=# select * from pg_stat_replication ;</div><div>-[ RECORD 1 ]----+------------------------------</div><div>pid              | 6298</div><div>usesysid         | 16384</div><div>usename          | rep_user</div><div>application_name | subscr</div><div>client_addr      | 192.168.7.225</div><div>client_hostname  |</div><div>client_port      | 43470</div><div>backend_start    | 2022-01-20 16:33:39.117019+00</div><div>backend_xmin     |</div><div>state            | catchup</div><div>sent_lsn         | 1DE/E849BFD8</div><div>write_lsn        | 1E0/3449C020</div><div>flush_lsn        | 1E0/3449C020</div><div>replay_lsn       | 1E0/3449C020</div><div>write_lag        |</div><div>flush_lag        |</div><div>replay_lag       |</div><div>sync_priority    | 0</div><div>sync_state       | async</div><div> </div><div>sent_lsn is lower than write_lsn, what? Is it legal?)</div><div> </div><div>4. pg_wal directory was bloated - it's ok I suppose because of opened replication slot. </div><div> </div><div>So, my question here: is it possible to fix replication without full restart (truncating tables in slave and copy all data)?</div></div></div>

Attachment Content-Type Size
unknown_filename text/html 3.5 KB

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Paul van der Linden 2022-01-24 09:17:40 Re: Cannot find hstore operator
Previous Message Tom Lane 2022-01-23 15:22:06 Re: Cannot find hstore operator