Re: Optionally automatically disable logical replication subscriptions on error

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Smith, Peter" <peters(at)fast(dot)au(dot)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Optionally automatically disable logical replication subscriptions on error
Date: 2021-06-22 02:29:38
Message-ID: ACC5CDA8-7318-4752-B65E-479EFFA72667@enterprisedb.com
Lists: pgsql-hackers

> On Jun 21, 2021, at 5:57 PM, Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> #5. Document to refer to the logs. All ERROR details are already in
> the logs, and this seems to me the intuitive place to look for them.

My original motivation came from writing TAP tests to check that the permissions systems would properly deny the apply worker when running under a non-superuser role. The idea is that the user with the responsibility for managing subscriptions won't have enough privilege to read the logs. Whatever information that user needs (if any) must be someplace else.

> Searching for specific errors becomes difficult programmatically (is
> this really a problem other than complex TAP tests?).

I believe there is a problem, because I remain skeptical that these errors will be both existent and rare. Either you've configured your system correctly and you get zero of these, or you've misconfigured it and you get some non-zero number of them. I don't see any reason to assume that number will be small.

The best way to deal with that is to be able to tell the system what to do with them, like "if the error has this error code and the error message matches this regular expression, then do this, else do that." That's why I think allowing triggers to be created on subscriptions makes the most sense (though it is probably the hardest system proposed so far).
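
As a rough illustration of the "error code plus regular expression, then do this, else do that" idea, here is a minimal sketch in Python. It is not part of any patch and not a PostgreSQL API: the Rule class, the choose_action function, and the action names are all hypothetical, made up for illustration. Only the SQLSTATE values (42501 for insufficient_privilege, 23505 for unique_violation) are real PostgreSQL error codes.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One hypothetical rule: match on SQLSTATE and a message regex."""
    sqlstate: str          # five-character SQLSTATE error code
    message_pattern: str   # regular expression searched for in the message
    action: str            # invented action name, for illustration only


def choose_action(rules, sqlstate, message, default="disable_subscription"):
    """Return the action of the first rule matching both code and message.

    Falls back to `default` (the "else do that" branch) when nothing matches.
    """
    for rule in rules:
        if rule.sqlstate == sqlstate and re.search(rule.message_pattern, message):
            return rule.action
    return default


# Hypothetical rule table; the actions are placeholders, not real features.
RULES = [
    Rule("42501", r"permission denied for table \w+", "notify_owner"),
    Rule("23505", r"duplicate key value", "skip_transaction"),
]

if __name__ == "__main__":
    print(choose_action(RULES, "42501", "permission denied for table accounts"))
    print(choose_action(RULES, "42501", "some unrelated message"))
```

The first matching rule wins, so more specific rules would go earlier in the table. A trigger on a subscription would presumably express the same decision in SQL or PL/pgSQL rather than in a static table like this.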

> But here there
> is no risk of missing or insufficient information captured in the log
> files ("but still there will be some information related to ERROR
> which we wanted the user to see unless we ask them to refer to logs
> for that." [Amit-4]).

Not only is there a problem if the user doesn't have permission to view the logs, but also, if we automatically disable the subscription until the error is manually cleared, the logs might be rotated out of existence before the user takes any action. In that case, the logs will be entirely missing, and not even the error message will remain. At least with the patch I submitted, the error message will remain, though I take Amit's point that there are deficiencies in handling parallel tablesync workers, etc.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
