Re: xid_wraparound tests intermittent failure.

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: xid_wraparound tests intermittent failure.
Date: 2024-07-29 21:25:11
Message-ID: CAD21AoBHZUmxuPnUcQ+Fen9z_1-DWLdFDzoAd=Qc4XbU9gf7CQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jul 27, 2024 at 1:06 PM Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>
>
> On 2024-07-26 Fr 1:46 PM, Masahiko Sawada wrote:
> > On Thu, Jul 25, 2024 at 6:52 PM Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> >>
> >> On 2024-07-25 Th 3:40 PM, Masahiko Sawada wrote:
> >>
> >> On Thu, Jul 25, 2024 at 11:06 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >>
> >> On Thu, Jul 25, 2024 at 10:56 AM Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> >>
> >> On 2024-07-23 Tu 6:59 PM, Masahiko Sawada wrote:
> >>
> >> See <https://bitbucket.org/adunstan/rotfang-fdw/downloads/xid-wraparound-result.tar.bz2>
> >>
> >>
> >> The failure logs are from a run where both tests 1 and 2 failed.
> >>
> >> Thank you for sharing the logs.
> >>
> >> I think that the problem seems to match what Alexander Lakhin
> >> mentioned[1]. Probably we can fix such a race condition somehow but
> >> I'm not sure it's worth it as setting autovacuum = off and
> >> autovacuum_max_workers = 1 (or a low number) is an extremely rare
> >> case. I think it would be better to stabilize these tests. One idea is
> >> to turn the autovacuum GUC parameter on while setting
> >> autovacuum_enabled = off for each table. That way, we can ensure that
> >> autovacuum workers are launched. I also think this aligns with real
> >> use cases.
> >>
> >> Regards,
> >>
> >> [1] https://www.postgresql.org/message-id/02373ec3-50c6-df5a-0d65-5b9b1c0c86d6%40gmail.com
> >>
> >>
> >> OK, do you want to propose a patch?
> >>
> >> Yes, I'll prepare and share it soon.
> >>
> >> I've attached the patch. Could you please test if the patch fixes the
> >> instability you observed?
> >>
> >> Since we turn off autovacuum on all three tests and we wait for
> >> autovacuum to complete processing databases, these tests potentially
> >> have a similar (but lower) risk. So I modified these tests to turn it
> >> on so we can ensure the autovacuum runs periodically.
> >>
> >>
> >> I assume you actually meant to remove the "autovacuum = off" in 003_wraparound.pl. With that change in your patch I retried my test, but on iteration 100 out of 100 it failed on test 002_limits.pl.
> >>
> > I think we need to remove the "autovacuum = off" also in 002_limits.pl
> > as it waits for autovacuum to process both template0 and template1
> > databases. Just to be clear, the failure happened even without
> > "autovacuum = off"?
> >
>
> The attached patch, a slight modification of yours, removes "autovacuum
> = off" for all three tests, and with that change a set of 200 runs was
> clean for me.

Oh, I missed that I had left "autovacuum = off" in the 002 test for some
reason. Thank you for attaching the patch; it looks good to me.
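
For reference, the approach discussed in this thread (leave the autovacuum
GUC at its default of on, and opt individual tables out through the
autovacuum_enabled storage parameter) could be sketched roughly as below.
This is only an illustration in Python, not the attached patch: the helper
names and the table name "large" are hypothetical, and the real tests are
Perl TAP scripts.

```python
import re

# Matches an uncommented "autovacuum = off" line, optionally followed by a comment.
# It does not match other settings such as autovacuum_max_workers.
_AUTOVACUUM_OFF = re.compile(r"^\s*autovacuum\s*=\s*off\s*(#.*)?$", re.IGNORECASE)


def drop_global_autovacuum_off(conf_text: str) -> str:
    """Return the config text without lines that disable autovacuum globally.

    Hypothetical helper: with the line removed, the GUC falls back to its
    default (on), so autovacuum workers can be launched.
    """
    kept = [line for line in conf_text.splitlines()
            if not _AUTOVACUUM_OFF.match(line)]
    return "\n".join(kept)


def disable_autovacuum_per_table(tables):
    """Build ALTER TABLE statements that turn autovacuum off per table.

    Hypothetical helper: uses the autovacuum_enabled storage parameter so
    that only the listed tables are skipped by routine autovacuum.
    """
    return [f"ALTER TABLE {name} SET (autovacuum_enabled = off);"
            for name in tables]


if __name__ == "__main__":
    conf = "autovacuum = off\nautovacuum_max_workers = 1"
    print(drop_global_autovacuum_off(conf))
    for stmt in disable_autovacuum_per_table(["large"]):
        print(stmt)
```

The point of splitting it this way is that the global setting no longer
prevents workers from being launched, while the per-table parameter still
keeps routine autovacuum away from the tables a test wants to control.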

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
