Re: BUG #17716: walsender process hang while decoding 'DROP PUBLICATION' XLOG

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Bowen Shi <zxwsbg12138(at)gmail(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: BUG #17716: walsender process hang while decoding 'DROP PUBLICATION' XLOG
Date: 2022-12-20 06:10:05
Message-ID: CAJpy0uAELGvoeOydnB+VzLmqxgX-TsBEOBmUL-ysS-BqTVxuiQ@mail.gmail.com
Lists: pgsql-bugs

Hello,
The idea looks good to me. For the relation schema cache (the pgoutput one),
on receiving an invalidation message for a single hash value, we invalidate
the entire cache because there is no way to find the entry corresponding to
that hash value, so your proposed fix will make a real difference. But I feel
it makes sense on HEAD as well.

This complete cache invalidation happens multiple times even on HEAD (10k
times for the given case). The cache is mostly empty in the given test case,
but consider a case with a huge number of publications and subscriptions (so
that this cache has a huge number of entries) where we then drop one large
publication with, say, 40k-50k tables. In that case we might see slowness on
HEAD as well while traversing and invalidating the cache. The test case could
be tried once on HEAD at a larger scale to see whether we need the fix there
or not.
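
As a rough illustration of the lazy-tag idea under discussion, here is a
minimal standalone sketch. It is not the proposed patch and not PostgreSQL
code: SchemaCache, CacheEntry, the all_invalid flag, and every function name
below are hypothetical, and a fixed-size array stands in for the real hash
table. It assumes the whole cache is invalidated on each message and that a
flag can record "everything is already invalid".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_CAPACITY 1024          /* arbitrary size for this sketch */

/* One cached relation entry (hypothetical; not the pgoutput struct). */
typedef struct CacheEntry
{
    unsigned int relid;              /* stand-in for a relation OID */
    bool         valid;              /* false => rebuild before next use */
} CacheEntry;

typedef struct SchemaCache
{
    CacheEntry    entries[CACHE_CAPACITY];
    size_t        nentries;
    bool          all_invalid;       /* the "lazy tag" */
    unsigned long full_scans;        /* number of full traversals performed */
} SchemaCache;

void
cache_init(SchemaCache *cache)
{
    cache->nentries = 0;
    cache->all_invalid = true;       /* an empty cache has nothing valid */
    cache->full_scans = 0;
}

/* Add a fresh, valid entry; the cache is no longer "all invalid". */
CacheEntry *
cache_add(SchemaCache *cache, unsigned int relid)
{
    CacheEntry *entry;

    assert(cache->nentries < CACHE_CAPACITY);
    entry = &cache->entries[cache->nentries++];
    entry->relid = relid;
    entry->valid = true;
    cache->all_invalid = false;
    return entry;
}

/* Rebuild an entry on next use; this must also clear the lazy tag. */
void
cache_revalidate(SchemaCache *cache, CacheEntry *entry)
{
    entry->valid = true;
    cache->all_invalid = false;
}

/* Walk every entry and mark it invalid (the expensive part). */
static void
invalidate_all(SchemaCache *cache)
{
    for (size_t i = 0; i < cache->nentries; i++)
        cache->entries[i].valid = false;
    cache->full_scans++;
}

/* Without the tag: every invalidation message costs a full traversal. */
void
cache_invalidate_naive(SchemaCache *cache)
{
    invalidate_all(cache);
}

/* With the tag: skip the traversal if nothing became valid since last time. */
void
cache_invalidate_lazy(SchemaCache *cache)
{
    if (cache->all_invalid)
        return;
    invalidate_all(cache);
    cache->all_invalid = true;
}
```

In this sketch, 10k back-to-back invalidation messages cost one traversal
instead of 10k, as long as no entry is rebuilt in between. The flag has to be
cleared wherever an entry becomes valid again, otherwise a later invalidation
would be skipped wrongly.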

thanks
Shveta

On Mon, Dec 19, 2022 at 5:52 PM Bowen Shi <zxwsbg12138(at)gmail(dot)com> wrote:

> Hello,
> Thanks for your advice. I ran some tests, and this problem can't be
> reproduced in PG 14 and later versions. I think adding a new XLOG type
> would help resolve this problem, but I think the following patch may be
> helpful for PG 13.
>
> The invalidation contains two parts: pgoutput and relfilenodeMap. We
> have no way to optimize the relfilenodeMap part, as discussed in
> previous mails:
>
> https://www.postgresql.org/message-id/CANDwggKYveEtXjXjqHA6RL3AKSHMsQyfRY6bK+NqhAWJyw8psQ@mail.gmail.com
>
>
> However, I'd like to contribute a patch to fix the pgoutput part. We can
> skip invalidating the caches after the first time by using a lazy tag, and
> this works. It roughly doubles walsender performance while decoding this
> XLOG.
>
> I used the test from the last email and reduced the number of relations in
> the publications to 1000. The test results are as follows:
>
> Before optimization: 76 min
> After optimization: 35 min
>
> Though the result is not good enough, I think this patch is still worthwhile.
>
