From: | "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com> |
---|---|
To: | 'Amit Kapila' <amit(dot)kapila16(at)gmail(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
Cc: | Japin Li <japinli(at)hotmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | RE: Truncate in synchronous logical replication failed |
Date: | 2021-04-13 13:54:06 |
Message-ID: | OSBPR01MB4888098965EECA7359D5A693ED4F9@OSBPR01MB4888.jpnprd01.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Monday, April 12, 2021 3:58 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Mon, Apr 12, 2021 at 10:03 AM osumi(dot)takamichi(at)fujitsu(dot)com
> <osumi(dot)takamichi(at)fujitsu(dot)com> wrote:
> > but if we take a measure to fix the doc, we have to be careful for the
> > description, because when we remove the primary keys of 'test' tables on the
> scenario in [1], we don't have this issue.
> > It means TRUNCATE in synchronous logical replication is not always
> blocked.
> >
>
> The problem happens only when we try to fetch IDENTITY_KEY attributes
> because pgoutput uses RelationGetIndexAttrBitmap() to get that information
> which locks the required indexes. Now, because TRUNCATE has already
> acquired an exclusive lock on the index, it seems to create a sort of deadlock
> where the actual Truncate operation waits for logical replication of operation to
> complete and logical replication waits for actual Truncate operation to finish.
>
> Do we really need to use RelationGetIndexAttrBitmap() to build IDENTITY_KEY
> attributes? During decoding, we don't even lock the main relation, we just scan
> the system table and build that information using a historic snapshot. Can't we
> do something similar here?
I think we can build the IDENTITY_KEY attributes with NoLock
instead of calling RelationGetIndexAttrBitmap().
When we trace back the caller side of logicalrep_write_attrs(),
doing the thing equivalent to RelationGetIndexAttrBitmap()
for INDEX_ATTR_BITMAP_IDENTITY_KEY impacts only pgoutput_truncate.
OTOH, I can't find codes similar to RelationGetIndexAttrBitmap()
in pgoutput_* functions and in the file of relcache.c.
Therefore, I'd like to discuss how to address the hang.
My first idea is to extract some parts of RelationGetIndexAttrBitmap()
only for INDEX_ATTR_BITMAP_IDENTITY_KEY and implement those
either in a logicalrep_write_attrs() or as a new function.
RelationGetIndexAttrBitmap() has 'restart' label for goto statement
in order to ensure to return up-to-date attribute bitmaps, so
I prefer having a new function when we choose this direction.
Having that goto in logicalrep_write_attrs() makes it a little bit messy, I felt.
The other direction might be to extend RelationGetIndexAttrBitmap's function definition
to accept lockmode to give NoLock from logicalrep_write_attrs().
But, this change impacts on other several callers so is not as good as the first direction above, I think.
If someone has any better idea, please let me know.
Best Regards,
Takamichi Osumi
From | Date | Subject | |
---|---|---|---|
Next Message | Zhihong Yu | 2021-04-13 14:07:52 | Re: Have I found an interval arithmetic bug? |
Previous Message | Virender Singla | 2021-04-13 13:49:42 | Re: vacuum freeze - possible improvements |