Re: Show WAL write and fsync stats in pg_stat_io

From: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "bharath(dot)rupireddyforpostgres(at)gmail(dot)com" <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Date: 2025-01-24 08:31:02
Message-ID: CAN55FZ19F7K9OKCzp4jCdBB7-xMEbdEVWpzFcDvCz0na-oB3vQ@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Wed, 22 Jan 2025 at 03:14, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Thu, Jan 16, 2025 at 11:40:51AM +0300, Nazir Bilal Yavuz wrote:
> > I encountered another problem while rebasing the patch. The problem is
> > basically we do not expect any pending stats while restoring the stats
> > at the initdb. However, WAL IOs (WAL read and WAL init IOs for now)
> > may happen before restoring the stats, so we end up having pending
> > stats before restoring them and that causes initdb to fail.
>
> On top of 4feba03d8b92, I've reused something close to the patch you
> have posted previously in case, and the issue with allocations for
> pending stats should be gone.

Yes, they are fixed; thanks!

> Could it be possible to post a new version of the patch? You should
> be able to reuse pgstat_count_backend_io_op[_time]() for your work
> with WAL data in pg_stat_io if you need a low-level control of things,
> but I suspect that calling pgstat_count_io_op() & the other should be
> enough to get the job done with a new IOObject.

I think only one problem remains now. With this patch, walsenders have
stats to report, and they may shut down after the checkpointer, which
causes the '027_stream_regress.pl' test to fail. Andres is already
working on a fix for that issue [1]; the '027_stream_regress.pl' test
passes once his proposed fix is applied.

v9 is rebased and attached as three patches. The first is a squashed
version of Andres' current proposed fix, included so that the CI
passes; the second adds WAL stats to pg_stat_io; and the third makes
the pg_stat_wal view fetch its timing columns from pg_stat_io.

There is one change in the main patch (0002): stats are now flushed
after the main loop of the PerformWalRecovery() function in
xlogrecovery.c. Previously they were flushed inside the main loop, but
I thought that might be costly, so I moved the flush to after the loop.
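
To illustrate the trade-off, here is a minimal toy model in C. It is
not code from the patch: PendingIoStats, SharedIoStats, flush_pending()
and replay() are hypothetical stand-ins for the backend-local pending
counters, the shared stats, the flush routine, and the recovery loop.
It only shows that both placements report the same totals while the
number of flush calls differs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for backend-local pending counters. */
typedef struct PendingIoStats
{
    long wal_reads;
} PendingIoStats;

/* Hypothetical stand-in for the shared stats that views would read. */
typedef struct SharedIoStats
{
    long wal_reads;
    long flush_calls;   /* how many times a flush was performed */
} SharedIoStats;

/* Move pending counts into the shared counters and reset the pending ones. */
void
flush_pending(PendingIoStats *pending, SharedIoStats *shared)
{
    shared->wal_reads += pending->wal_reads;
    shared->flush_calls++;
    pending->wal_reads = 0;
}

/*
 * Toy replay loop: each record counts as one WAL read.  With
 * flush_in_loop set, stats are flushed after every record; otherwise
 * they are flushed once, after the loop has finished.
 */
SharedIoStats
replay(int nrecords, bool flush_in_loop)
{
    PendingIoStats pending = {0};
    SharedIoStats shared = {0, 0};

    assert(nrecords >= 0);

    for (int i = 0; i < nrecords; i++)
    {
        pending.wal_reads++;
        if (flush_in_loop)
            flush_pending(&pending, &shared);
    }

    if (!flush_in_loop)
        flush_pending(&pending, &shared);

    return shared;
}
```

In this toy model, replay(1000, true) and replay(1000, false) end with
the same wal_reads total, but the first performs 1000 flushes and the
second performs one. The price of the second variant is that nothing
reaches the shared counters until the loop ends.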

[1] postgr.es/m/flat/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu%40m3cfzxicm5kp

--
Regards,
Nazir Bilal Yavuz
Microsoft

Attachment                                                       Content-Type  Size
v9-0001-Squash-reorder-shutdown-sequence-patches.patch           text/x-patch  27.6 KB
v9-0002-Add-WAL-I-O-stats-to-both-pg_stat_io-view-and-per.patch  text/x-patch  23.0 KB
v9-0003-Fetch-timing-columns-from-pg_stat_io-in-the-pg_st.patch  text/x-patch  8.5 KB
