From: | Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "bharath(dot)rupireddyforpostgres(at)gmail(dot)com" <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Subject: | Re: Show WAL write and fsync stats in pg_stat_io |
Date: | 2025-01-29 11:57:21 |
Message-ID: | CAN55FZ3BK=6O6qcxzPH-4boThBJw0y73sWH-t2vDe_55LLiwGQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On Tue, 28 Jan 2025 at 07:23, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Mon, Jan 27, 2025 at 05:13:39PM +0300, Nazir Bilal Yavuz wrote:
> > On Mon, 27 Jan 2025 at 16:59, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> wrote:
> >>
> >> On Mon, 27 Jan 2025 at 03:52, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> >> We use PendingWalStats.wal_[ write | sync ]_time only to show timings
> >> on the pg_stat_wal view, right? And now these pg_stat_wal.wal_[ write
> >> | sync ]_time datas are fetched from the pg_stat_io view when the
> >> track_wal_io_timing is on. So, I think it is correct to remove these.
>
> As you say, removing the counters in the second patch does not matter
> as if you are going to combine them and..
>
> >>
> >> I made a mistake while splitting the patches. The places where
> >> 'PendingWalStats.wal_[ write | sync ]_time are incremented (the code
> >> piece you shared)' are removed in 0002 (0001 now), but they should be
> >> removed in 0003 (0002 now) instead. This is corrected in v11.
>
> My issue was in the first patch that should not have removed them. My
> apologies for the confusion, I should have pointed out that this was
> likely an incorrect rebase and/or patch split.
Ah, I see. Thanks for clarifying!
> > If we agree with removing PendingWalStats.wal_[ write | sync ]_time
> > variables, then it would make sense to remove PgStat_PendingWalStats
> > struct completely. We have that struct because [1] it is cheap to
> > store PendingWalStats.wal_[ write | sync ]_time as instr_time instead
> > of PgStat_Counter.
> >
> > [1] ca7b3c4c00
>
> I agree that some simplification would be nice because it also makes
> Bertrand's patch around [1] to not have some special handling with
> PgStat_PendingWalStats without us losing monitoring capabilities, I
> hope. So maximizing simplifications before integrating more
> capabilities should make the whole implementation effort easier.
>
> What you doing in 0001 is a first good step towards this goal, as this
> also plugs in a few things for backend statistics with the calls to
> pgstat_count_io_op[_time]().
I agree. Do you think that we need to do this simplification in this
thread or does it need its own thread?
>
> + /* Report pending statistics to the cumulative stats system */
> + pgstat_flush_io(false);
>
> This has been discussed previously under a pgstat_report_wal() call.
> Why do you need this specific call? Perhaps this should be documented
> as a comment?
Without this change, stats in the startup process would only get
reported during shutdown of the startup process, so I added this but I
was wrong. It was already fixed [1], so yes; we do not need it now.
This is removed in v12.
>
> + if (io_object == IOOBJECT_WAL)
> + return track_wal_io_timing
>
> Hmm. Andres has commented in the past that we want pg_stat_io to
> server as a central place for all the I/O statistics. Thinking more
> about that, I am not really convinced that we actually need to make
> this area of the code in pgstat_io.c rely on two GUCs. How about
> simplifying things so as we just rely on track_io_timing for
> everything, without creating a strange dependency on the IOObject with
> more routines like pgstat_should_track_io_time()? I'd really want
> less of these GUCs, not more of them with cross-dependencies depending
> on the stats kinds we are dealing with. If we replace the timings
> from pg_stat_wal with the ones in pg_stat_io, we should be in a good
> position to remove track_wal_io_timing entirely, as well. This has
> the merit of making your patch a lot simpler, meaning extra bonus
> points.
I agree with you but it was discussed before in this thread [2]. It
was decided to use both track_wal_io_timing and track_io_timing
because of the overhead that track_wal_io_timing creates but we can
still re-discuss it. Do you think that this discussion needs its own
thread?
If we continue to discuss it in this thread, I am in favor of removing
track_wal_io_timing and using track_io_timing for all types of I/Os.
Like you said, this cross-dependency makes things more complex than
they used to be. Downside of removing track_wal_io_timing is affecting
people who:
1- Want to track timings of only WAL I/Os.
2- Want to track timings of all IOs except WAL I/Os.
I think the first group is more important than the second because
track_io_timing already creates overhead.
One additional thing is that I think track_io_timing is a general
word. When it exists, I do not expect there to be another GUC like
track_wal_io_timing to track WAL I/Os' timings.
[1] e3cb1a586c
[2] https://www.postgresql.org/message-id/ZUmuJ7P8THHz03nx%40paquier.xyz
--
Regards,
Nazir Bilal Yavuz
Microsoft
Attachment | Content-Type | Size |
---|---|---|
v12-0001-Add-WAL-I-O-stats-to-both-pg_stat_io-view-and-pe.patch | text/x-patch | 22.6 KB |
v12-0002-Fetch-timing-columns-from-pg_stat_io-in-the-pg_s.patch | text/x-patch | 9.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Vitaly Davydov | 2025-01-29 12:00:54 | Re: An improvement of ProcessTwoPhaseBuffer logic |
Previous Message | Peter Eisentraut | 2025-01-29 11:35:00 | Re: further #include cleanup (IWYU) |