Re: logical replication worker accesses catalogs in error context callback

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: logical replication worker accesses catalogs in error context callback
Date: 2021-01-12 09:33:44
Message-ID: CALj2ACXti8yjgZ1KE3Cdck_Pju3-1gv_XtsQGoojPQa0JL3iYw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 12, 2021 at 9:40 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Mon, Jan 11, 2021 at 4:46 PM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> >
> > On Mon, Jan 11, 2021 at 10:56 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > Agreed. Attached the updated patch.
> >
> > Thanks for the updated patch. Looks like the comment crosses the 80
> > char limit, I have corrected it. And also changed the variable name
> > from remotetypeid to remotetypid, so that logicalrep_typmap_gettypname
> > will not cross the 80 char limit. And also added a commit message.
> > Attaching v3 patch, please have a look. Both make check and make
> > check-world passes.
>
> Thanks! The change looks good to me.

Thanks!

> > > > I quickly searched in places where error callbacks are being used, I
> > > > think we need a similar kind of fix for conversion_error_callback in
> > > > postgres_fdw.c, because get_attname and get_rel_name are being used
> > > > which do catalogue lookups. ISTM that all the other error callbacks
> > > > are good in the sense that they are not doing sys catalogue lookups.
> > >
> > > Indeed. If we need to disallow the catalog lookup during executing
> > > error callbacks we might want to have an assertion checking that in
> > > SearchCatCacheInternal(), in addition to Assert(IsTransactionState()).
> >
> > I tried to add(as attached in
> > v1-0001-Avoid-Sys-Cache-Lookups-in-Error-Callbacks.patch) the
> > Assert(!error_context_stack); in SearchCatCacheInternal, initdb itself
> > fails [1]. That means, we are doing a bunch of catalogue access from
> > error context callbacks. Given this, I'm not quite sure whether we can
> > have such an assertion in SearchCatCacheInternal.
>
> I think checking !error_context_stack is not a correct check if we're
> executing an error context callback since it's a stack to store
> callbacks. It can be non-NULL by setting an error callback, see
> setup_parser_errposition_callback() for instance.

Thanks! Yes, you are right, even though we are not processing the
actual error callback, the error_context_stack can be non-null, hence
the Assert(!error_context_stack); doesn't make any sense.

> Probably we need to check if (recursion_depth > 0) and elevel.
> Attached a patch for that as an example.

IIUC, we must be knowing in SearchCatCacheInternal, whether errstart
is called with elevel >= ERROR and we have recursion_depth > 0. If
both are true, then the assertion in SearchCatCacheInternal should
fail. I see that in your patch, elevel is being fetched from
errordata, that's fine. What I'm not quite clear is the following
assumption:

+ /* If we doesn't set any error data yet, assume it's an error */
+ if (errordata_stack_depth == -1)
+ return true;

Is it always true that we are in error processing when
errordata_stack_depth is -1, what happens if errordata_stack_depth <
-1? Maybe I'm missing something.

IMHO, adding an assertion in SearchCatCacheInternal(which is a most
generic code part within the server) with a few error context global
variables may not be always safe. Because we might miss using the
error context variables properly. Instead, we could add a comment in
ErrorContextCallback structure saying something like, "it's not
recommended to access any system catalogues within an error context
callback when the callback is expected to be called while processing
an error, because the transaction might have been broken in that
case." And let the future callback developers take care of it.

Thoughts?

As I said earlier [1], currently only two(there could be more) error
context callbacks access the sys catalogues, one is in
slot_store_error_callback which will be fixed with your patch. Another
is in conversion_error_callback, we can also fix this by storing the
relname, attname and other required info in ConversionLocation,
something like the attached. Please have a look.

Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
v1-0001-Avoid-Catalogue-Accesses-In-conversion_error_call.patch application/x-patch 7.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message japin 2021-01-12 09:47:23 Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION
Previous Message Noah Misch 2021-01-12 08:49:53 Re: Spurious "apparent wraparound" via SimpleLruTruncate() rounding