From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <msawada(at)postgresql(dot)org> |
Subject: | Re: xid_wraparound tests intermittent failure. |
Date: | 2024-07-22 16:07:32 |
Message-ID: | CAD21AoAmimDuc_xexq0FF+gt2Vy=wmLK==a-frQUhg-rv=+D6g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Jul 21, 2024 at 2:36 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> > On 2024-07-21 Su 1:34 PM, Tom Lane wrote:
> >> Locally, I've not managed to reproduce the failure yet; so perhaps
> >> there is some platform dependency. What are you testing on?
>
> > Linux ub22arm 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:08:40 UTC
> > 2024 aarch64 aarch64 aarch64 GNU/Linux
> > It's a VM running on UTM/Apple Silicon
>
> Hmm, doesn't sound like that ought to be slow.
>
> I did manage to reproduce dodo's failures by running xid_wraparound
> manually on mamba's very slow host:
>
> $ time make -s installcheck PROVE_FLAGS=--timer
> # +++ tap install-check in src/test/modules/xid_wraparound +++
> [13:37:49] t/001_emergency_vacuum.pl .. 1/? # poll_query_until timed out executing this query:
> #
> # SELECT NOT EXISTS (
> # SELECT *
> # FROM pg_database
> # WHERE age(datfrozenxid) > current_setting('autovacuum_freeze_max_age')::int)
> #
> # expecting this output:
> # t
> # last actual query output:
> # f
> # with stderr:
> # Tests were run but no plan was declared and done_testing() was not seen.
> # Looks like your test exited with 4 just after 1.
> [13:37:49] t/001_emergency_vacuum.pl .. Dubious, test returned 4 (wstat 1024, 0x400)
> All 1 subtests passed
> [14:06:51] t/002_limits.pl ............ 2/? # Tests were run but no plan was declared and done_testing() was not seen.
> # Looks like your test exited with 29 just after 2.
> [14:06:51] t/002_limits.pl ............ Dubious, test returned 29 (wstat 7424, 0x1d00)
> All 2 subtests passed
> [14:31:16] t/003_wraparounds.pl ....... ok 7564763 ms ( 0.00 usr 0.01 sys + 13.82 cusr 9.26 csys = 23.09 CPU)
> [16:37:21]
>
> Test Summary Report
> -------------------
> t/001_emergency_vacuum.pl (Wstat: 1024 (exited 4) Tests: 1 Failed: 0)
> Non-zero exit status: 4
> Parse errors: No plan found in TAP output
> t/002_limits.pl (Wstat: 7424 (exited 29) Tests: 2 Failed: 0)
> Non-zero exit status: 29
> Parse errors: No plan found in TAP output
> Files=3, Tests=4, 10772 wallclock secs ( 0.15 usr 0.06 sys + 58.50 cusr 59.88 csys = 118.59 CPU)
> Result: FAIL
> make: *** [../../../../src/makefiles/pgxs.mk:442: installcheck] Error 1
> 10772.99 real 59.34 user 60.14 sys
>
> Each of those two failures looks just like something that dodo has
> shown at one time or another. So it's at least plausible that
> "slow machine" is the whole explanation. I'm still wondering
> though if there's some effect that causes the test's runtime to
> be unstable in itself, sometimes leading to timeouts.
>
Since the server writes a lot of logs during the xid_wraparound test,
"slow disk" could also be a reason.
Looking at dodo's failures, it seems that while it passes
module-xid_wraparound-check, all failures happened only during
testmodules-install-check-C. Can we check the server logs written
during xid_wraparound test in testmodules-install-check-C? I thought
the following link is the server logs but since it seems there were no
autovacuum logs I suspected there is another log file:
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Justin Pryzby | 2024-07-22 16:28:23 | Re: warn if GUC set to an invalid shared library |
Previous Message | Corey Huinker | 2024-07-22 16:05:34 | Re: Statistics Import and Export |