Re: New instability in stats regression test

From:	Michael Paquier <michael(at)paquier(dot)xyz>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: New instability in stats regression test
Date:	2023-11-27 22:41:31
Message-ID:	ZWUbG9pP_5Myuero@paquier.xyz
Lists:	pgsql-hackers

On Mon, Nov 27, 2023 at 02:01:51PM -0500, Tom Lane wrote:
> The problem as I see it is that this test:
>
> SELECT :io_stats_post_reset < :io_stats_pre_reset;
>
> requires an assumption that less I/O has happened since the commanded
> reset action than happened before it (extending back to the previous
> reset, or cluster start). Since concurrent processes might be doing
> I/O, this has a race condition. If we are slow enough about obtaining
> :io_stats_post_reset, the test *will* fail eventually. But the shorter
> the distance back to the previous reset, the bigger the odds of
> observable trouble; thus Michael's concern that adding more reset
> tests in future would increase the risk of failure.

The new reset added just before checking the contents of pg_stat_io
reduces :io_stats_pre_reset from 7M to 50k. That threshold is easy
to reach if a checkpoint or an autovacuum is running in
parallel. I have not checked the buildfarm logs in detail, but I'd
bet on a time-triggered checkpoint if the issue happened on
a slow machine.
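
To make the arithmetic concrete, here is a toy model of the comparison
in Python. It is only a sketch: reset_check_passes() and the 60k figure
for concurrent I/O are made up for illustration, and only the 7M and
50k values come from the numbers above.

```python
def reset_check_passes(pre_reset: int, concurrent_io: int) -> bool:
    """Toy model of `SELECT :io_stats_post_reset < :io_stats_pre_reset`.

    After the reset the counters restart from zero, so the value sampled
    afterwards is just the I/O other processes did in the meantime.
    """
    post_reset = concurrent_io
    return post_reset < pre_reset


# Assumption: I/O counted between the reset and the second sample,
# e.g. from a checkpoint or autovacuum on a slow machine.
concurrent_io = 60_000

# Threshold before the extra reset was added (about 7M).
print(reset_check_passes(7_000_000, concurrent_io))

# Threshold after the extra reset was added (about 50k).
print(reset_check_passes(50_000, concurrent_io))
```

With the larger threshold the same amount of background I/O leaves the
check passing; with the smaller one it fails.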
--
Michael

In response to

Re: New instability in stats regression test at 2023-11-27 19:01:51 from Tom Lane
