Re: Show WAL write and fsync stats in pg_stat_io

From: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "bharath(dot)rupireddyforpostgres(at)gmail(dot)com" <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Show WAL write and fsync stats in pg_stat_io
Date: 2025-02-05 20:02:15
Message-ID: CAN55FZ07b1AvMbHrxc9HUobK+eNQ-cNTrSBqQvgzkoLXYuY9gw@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Wed, 5 Feb 2025 at 21:32, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Michael Paquier <michael(at)paquier(dot)xyz> writes:
> > At the end, we want this patch and this data, and my benchmarking is
> > not showing much differences even if going through a workload with
> > many pages, so I've used the version relying entirely on
> > track_io_timing and applied it.
>
> Locally, the test added by this commit fails like so:
>
> diff -U3 /home/postgres/pgsql/src/test/regress/expected/stats.out /home/postgres/pgsql/src/test/regress/results/stats.out
> --- /home/postgres/pgsql/src/test/regress/expected/stats.out 2025-02-04 12:33:07.456393545 -0500
> +++ /home/postgres/pgsql/src/test/regress/results/stats.out 2025-02-05 13:08:30.605638432 -0500
> @@ -886,7 +886,7 @@
> WHERE context = 'normal' AND object = 'wal';
> ?column?
> ----------
> - t
> + f
> (1 row)
>
> -----
>
> This is pretty repeatable (not perfectly so) in a build with
> --enable-debug --enable-cassert --enable-tap-tests --with-llvm
> but it usually passes without --with-llvm. System is fairly
> up-to-date RHEL8 on x86_64. No idea why the buildfarm isn't
> unhappy. Any pointers where to look?

Thanks for the report!

My thinking when adding this test was that the startup process must do
WAL read I/O while the server is starting, i.e.:

'''
startup process ->
InitWalRecovery ->
ReadCheckpointRecord ->
ReadRecord ->
XLogPrefetcherReadRecord ->
lrq_complete_lsn ->
lrq_prefetch ->
lrq->next = XLogPrefetcherNextBlock ->
XLogReadAhead ->
XLogDecodeNextRecord ->
ReadPageInternal ->
state->routine.page_read = XLogPageRead()
'''

Is there a chance that the function chain above does not get triggered
while running the stats.sql test?
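
One way to check this could be to break the same pg_stat_io rows down
by backend type instead of aggregating them. This is only a sketch: it
assumes the failing check aggregates the "reads" column (only the WHERE
clause is visible in the diff above) and that the startup process is
reported with backend_type = 'startup'.

'''
-- Assumption: "reads" is the column the failing test aggregates.
SELECT backend_type, reads
  FROM pg_stat_io
  WHERE context = 'normal' AND object = 'wal'
  ORDER BY backend_type;
'''

If the startup row shows zero reads on the failing build, that would
suggest the chain above was either not reached or its I/O was not
counted there.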

--
Regards,
Nazir Bilal Yavuz
Microsoft
