Re: Forget close an open relation in ReorderBufferProcessTXN()

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>
Cc: Japin Li <japinli(at)hotmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Marco Nenciarini <marco(dot)nenciarini(at)2ndquadrant(dot)it>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Forget close an open relation in ReorderBufferProcessTXN()
Date: 2021-05-18 11:59:32
Message-ID: CAA4eK1J4Ofw_ZprzbL3EZhr=Hw7tRRA60EWLEuhN-LdzLBnvdw@mail.gmail.com
Lists: pgsql-hackers

On Tue, May 18, 2021 at 1:29 PM osumi(dot)takamichi(at)fujitsu(dot)com
<osumi(dot)takamichi(at)fujitsu(dot)com> wrote:
>
> On Monday, May 17, 2021 6:45 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > We allow taking locks on system catalogs, so why prohibit
> > user_catalog_tables? However, I agree that if we want plugins to acquire the
> > lock on user_catalog_tables then we should either prohibit decoding of such
> > relations or do something else to avoid deadlock hazards.
> OK.
>
> Although we have not yet decided the scope of the restriction on logical decoding of user_catalog_table
> (whether to exclude only the TRUNCATE command or all operations on that type of table),
> I'm worried that disallowing logical decoding of user_catalog_table could still lead to
> a deadlock, because disabling decoding by itself does not affect the
> lock taken by the TRUNCATE command. What I have in mind is the example below.
>
> (1) plugin (e.g. pgoutput) is designed to take a lock on user_catalog_table.
> (2) logical replication is set up in synchronous mode.
> (3) TRUNCATE command takes an access exclusive lock on the user_catalog_table.
> (4) This time, we don't do anything for the TRUNCATE decoding.
> (5) the plugin tries to take a lock on the truncated table
> but, it can't due to the lock by TRUNCATE command.
>

If you skip decoding of TRUNCATE, then we won't invoke the plugin API, so
step 5 will never happen.

> I was not sure that the place where the plugin takes the lock is in truncate_cb
> or somewhere else not directly related to decoding of the user_catalog_table itself,
> so I might be wrong. However, in that case,
> the solution would be not to disable the decoding of user_catalog_table
> but to prohibit the TRUNCATE command on user_catalog_table in synchronous mode.
> If this is true, I need to extend an output plugin to simulate the deadlock first
> and then remove it by fixing the TRUNCATE side. Thoughts?
>

I suggest not spending too much time reproducing this, because it is
quite clear that it will lead to a deadlock if the plugin acquires a lock
on user_catalog_table and we allow decoding of TRUNCATE. But if you
want to see how that happens, you can try it as well.
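
For anyone who wants to see the shape of the problem without building a
test plugin, here is a rough toy model of the wait-for relationships in
the scenario above. It is only a sketch under stated assumptions: the
node names (truncate_backend, walsender) and the two helper functions
are made up for illustration and are not PostgreSQL code. It assumes
synchronous replication, so the backend that ran TRUNCATE still holds
its lock while waiting for the walsender, and the plugin callback (and
hence its lock request) happens only when TRUNCATE is decoded.

```python
# Toy wait-for graph; all names are illustrative, not PostgreSQL internals.

def has_cycle(wait_for):
    """Return True if the directed wait-for graph contains a cycle."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node already on the current path
        visiting.add(node)
        for nxt in wait_for.get(node, ()):
            if visit(nxt):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(visit(n) for n in list(wait_for))


def build_wait_for(decode_truncate, plugin_locks_table):
    # Assumption: under synchronous replication the backend that ran
    # TRUNCATE keeps its AccessExclusiveLock while it waits for the
    # walsender to confirm the commit.
    graph = {"truncate_backend": {"walsender"}, "walsender": set()}
    # The plugin callback runs only if TRUNCATE is decoded; only then can
    # the walsender block on the lock still held by the backend.
    if decode_truncate and plugin_locks_table:
        graph["walsender"].add("truncate_backend")
    return graph


if __name__ == "__main__":
    for decode in (True, False):
        g = build_wait_for(decode_truncate=decode, plugin_locks_table=True)
        print(f"decode_truncate={decode}: deadlock={has_cycle(g)}")
```

In this model the cycle appears only when TRUNCATE is decoded and the
plugin asks for the lock; skipping the decoding removes the edge from
the walsender back to the backend.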

--
With Regards,
Amit Kapila.
