Re: per backend WAL statistics

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: per backend WAL statistics
Date: 2025-03-04 08:48:28
Message-ID: Z8a+XFNm5RV4LTy1@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Tue, Mar 04, 2025 at 09:28:27AM +0900, Michael Paquier wrote:
> On Mon, Mar 03, 2025 at 09:17:30AM +0000, Bertrand Drouvot wrote:
> > On Mon, Mar 03, 2025 at 10:48:23AM +0900, Michael Paquier wrote:
> >> Something that's still not quite right is that the WAL receiver and
> >> the WAL summarizer do not call pgstat_report_wal() at all, so we don't
> >> report much data and we expect these processes to run continuously.
> >> The location where to report stats for the WAL summarizer is simple,
> >> even if the system is aggressive with WAL this is never called more
> >> than a couple of times per seconds, like the WAL writer:
> >
> > Same as above, that sounds right after a quick look.
>
> Attached is a patch for this set of issues for the WAL receiver, the
> WAL summarizer and the WAL writer.

Thanks for the patch!

=== 1

@@ -1543,6 +1544,9 @@ summarizer_read_local_xlog_page(XLogReaderState *state,
HandleWalSummarizerInterrupts();
summarizer_wait_for_wal();

+ /* report pending statistics to the cumulative stats system */
+ pgstat_report_wal(false);

s/report/Report/ and s/system/system./? to be consistent with the other single
line comments around.

=== 2

+ /*
+ * Report pending statistics to the cumulative stats
+ * system. This location is useful for the report as it is
+ * not within a tight loop in the WAL receiver, which
+ * would bloat requests to pgstats, while also making sure
+ * that the reports happen at least each time a status
+ * update is sent.
+ */

Yeah, I also think that's the right location.

Nit: s/would/could/?

=== 3

+ /*
+ * Some BackendTypes only perform IO under IOOBJECT_WAL, hence exclude all
+ * rows for all the other objects for these.
+ */
+ if ((bktype == B_WAL_SUMMARIZER || bktype == B_WAL_RECEIVER ||
+ bktype == B_WAL_WRITER) && io_object != IOOBJECT_WAL)
+ return false;

I think that makes sense and it removes 15 lines out of 86. This function is
"hard" to read/parse from my point of view. Maybe we could re-write it in a
simpler way but that's outside the purpose of this thread.

=== 4

+ WHERE backend_type = 'walsummarizer' AND object = 'wal'});

The object = 'wal' is not needed (thanks to === 3), maybe we can remove this
filtering?

Also what about adding a test to check that sum(reads) is NULL where object != 'wal'?

=== 5

Same remark as above for the WAL receiver (excepts that sum(writes) is NULL where
object != 'wal').

> All that should be fixed before looking at the remaining patch for the
> WAL stats at backend level

Not sure as that would also prevent the other backend types to report their WAL
statistics if the above is not fixed.

>, so what do you think about the attached?

That's pretty straightforward, so yeah we can wait that it goes in before
moving forward with the WAL stats at backend level.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2025-03-04 08:50:36 Re: Accidental assignment instead of compare in GetOperatorFromCompareType?
Previous Message Peter Eisentraut 2025-03-04 08:28:37 Re: bug: ALTER TABLE ADD VIRTUAL GENERATED COLUMN with table rewrite