Re: subscription disable_on_error not working after ALTER SUBSCRIPTION set bad conninfo

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
Subject: Re: subscription disable_on_error not working after ALTER SUBSCRIPTION set bad conninfo
Date: 2024-01-18 01:54:48
Message-ID: CAD21AoBF0EGGshy0LGdfRoBUz+m9QZ-U24eJP6bheamh9cMmoA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 18, 2024 at 8:16 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Hi,
>
> I had reported a possible subscription 'disable_on_error' bug found
> while reviewing another patch.
>
> I am including my initial report and Nisha's analysis again here so
> that this topic has its own thread.
>
> ==================
> INITIAL REPORT [1]
> ==================
>
> ...
> I see now that any ALTER of the subscription's connection, even to
> some value that fails, will restart a new worker (like ALTER of any
> other subscription parameters). For a bad connection, it will continue
> to relaunch-worker/ERROR over and over. e.g.
>
> ----------
> 2024-01-17 09:34:28.665 AEDT [11274] LOG: logical
> replication apply worker for subscription "sub4" has started
> 2024-01-17 09:34:28.666 AEDT [11274] ERROR: could not connect to the
> publisher: invalid port number: "-1"
> 2024-01-17 09:34:28.667 AEDT [928] LOG: background worker "logical
> replication apply worker" (PID 11274) exited with exit code 1
> 2024-01-17 09:34:33.669 AEDT [11391] LOG: logical replication apply
> worker for subscription "sub4" has started
> 2024-01-17 09:34:33.669 AEDT [11391] ERROR: could not connect to the
> publisher: invalid port number: "-1"
> 2024-01-17 09:34:33.670 AEDT [928] LOG: background worker "logical
> replication apply worker" (PID 11391) exited with exit code 1
> etc...
> ----------
>
> While experimenting with the bad connection ALTER I also tried setting
> 'disable_on_error' like below:
>
> ALTER SUBSCRIPTION sub4 SET (disable_on_error);
> ALTER SUBSCRIPTION sub4 CONNECTION 'port = -1';
>
> ...but here the subscription did not become DISABLED as I expected it
> would do on the next connection error iteration. It remains enabled
> and just continues to loop relaunch/ERROR indefinitely same as before.
>
> That looks like it may be a bug. Thoughts?

Although we could improve it to handle this case too, I'm not sure it's
a bug. The documentation for disable_on_error says [1]:

    Specifies whether the subscription should be automatically disabled if
    any errors are detected by subscription workers during data
    replication from the publisher.

When an apply worker is still trying to establish a connection, it is
not yet replicating data from the publisher, so a connection failure
arguably falls outside the documented behavior.
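
For anyone reproducing this, a minimal way to check the subscription
state after the two ALTER commands is to query the pg_subscription
catalog. This is only a sketch: it assumes the subscription name "sub4"
from the report and PostgreSQL 15 or later, where the
subdisableonerr column exists.

```sql
-- Assumption: subscription "sub4" from the report, PostgreSQL 15+.
-- subenabled      : whether the subscription is currently enabled
-- subdisableonerr : whether disable_on_error is set
SELECT subname, subenabled, subdisableonerr
FROM pg_subscription
WHERE subname = 'sub4';
```

In the scenario reported above, subenabled would be expected to stay
true while the worker keeps failing to connect, even though
subdisableonerr is also true.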

Regards,

[1] https://www.postgresql.org/docs/devel/sql-createsubscription.html#SQL-CREATESUBSCRIPTION-PARAMS-WITH-DISABLE-ON-ERROR

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
