Re: PG12 autovac issues

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Justin King <kingpin867(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-general(at)lists(dot)postgresql(dot)org, kgrittn(at)gmail(dot)com
Subject: Re: PG12 autovac issues
Date: 2020-03-27 19:23:03
Message-ID: 20200327192303.GA20785@nol
Lists: pgsql-admin pgsql-general

On Fri, Mar 27, 2020 at 02:12:04PM +0900, Michael Paquier wrote:
> On Thu, Mar 26, 2020 at 09:46:47AM -0500, Justin King wrote:
> > Nope, it was just these tables that were looping over and over while
> > nothing else was getting autovac'd. I'm happy to share the full log
> > if you'd like.
>
> Thanks, that could help. If that's very large, it could be a problem
> to send that to the lists, but you could send me directly a link to
> it and I'll try to extract more information for the lists. While
> testing for reproducing the issue, I have noticed that basically one
> set of catalog tables happened to see this "skipping redundant" log.
> And I am wondering if we have a match with the set of catalog tables
> looping.
>
> > I did have to remove it from this state, but I can undo my workaround
> > and, undoubtedly, it'll end up back there. Let me know if there's
> > something specific you'd like me to provide when it happens!
>
> For now I think it's fine. Note that Julien and I have an environment
> where the issue can be reproduced easily (it takes roughly 12 hours
> until the wraparound cutoffs are reached with the benchmark and
> settings used), and we are checking things using a patched instance
> with 2aa6e33 reverted. I think that we are accumulating enough
> evidence that this change was not a good idea anyway thanks to the
> information you sent, so likely we'll finish first by a revert of
> 2aa6e33 from the master and REL_12_STABLE branches, before looking at
> the issues with the catalogs for those anti-wraparound and
> non-aggressive jobs (this looks like a relcache issue with the so-said
> catalogs).

FTR we reached the 200M transaction mark earlier, and I can see multiple log
entries of the form "automatic vacuum to prevent wraparound", i.e. non-aggressive
anti-wraparound autovacuums, all on shared relations.

As those vacuums weren't skipped, autovacuum didn't get stuck in a loop on them
and continued its work normally. This happened ~4h ago and hasn't occurred again,
even though the 200M threshold has been reached multiple times since.
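
For anyone who wants to watch for the same condition on their own instance,
here is a minimal sketch of the check. It assumes the "200M" above corresponds
to the default autovacuum_freeze_max_age; the function and constant names are
made up for illustration, and the comparison is a simplified model rather than
the actual autovacuum code.

```python
# Simplified model of the anti-wraparound trigger, not PostgreSQL source code.
# Assumption: the 200M threshold is the default autovacuum_freeze_max_age.
AUTOVACUUM_FREEZE_MAX_AGE = 200_000_000


def forces_antiwraparound(xid_age, freeze_max_age=AUTOVACUUM_FREEZE_MAX_AGE):
    """Return True when a relation's relfrozenxid is older than the limit."""
    return xid_age > freeze_max_age


# Hypothetical helper query: XID age of the shared relations, oldest first.
# Run it with psql or any client; it only reads pg_class.
SHARED_RELATION_AGE_QUERY = """
SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relisshared AND relkind IN ('r', 't')
ORDER BY xid_age DESC;
"""

if __name__ == "__main__":
    # Example ages, invented purely to exercise the function.
    for age in (150_000_000, 200_000_001):
        print(age, forces_antiwraparound(age))
```

Comparing the xid_age values returned by the query against the limit shows
which shared relations are due for a forced vacuum.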
