Re: [BUG]: the walsender does not update its IO statistics until it exits

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [BUG]: the walsender does not update its IO statistics until it exits
Date: 2025-02-28 10:39:31
Lists: pgsql-hackers


On Fri, Feb 28, 2025 at 02:41:34PM +0900, Michael Paquier wrote:
> On Wed, Feb 26, 2025 at 09:48:50AM +0000, Bertrand Drouvot wrote:
> > Yeah I think that makes sense, done that way in the attached.
> >
> > Speaking about physical walsender, I moved the test to instead
> > (would also fail without the fix).
> Hmm. I was doing some more checks with this patch, and on closer look
> I am wondering if the location you have chosen for the stats reports
> is too aggressive: this requires a LWLock for the WAL sender backend
> type taken in exclusive mode, with each step of WalSndLoop() taken
> roughly each time a record or a batch of records is sent. A single
> installcheck with a primary/standby setup can lead to up to 50k stats
> report calls.

Yeah, what I can observe (installcheck with a primary/standby):

- the extra flushes are called about 70K times.

Among those 70K:

- about 575 get past the "have_iostats" check in pgstat_io_flush_cb()
- about 575 get past the PendingBackendStats pg_memory_is_all_zeros()
check in pgstat_flush_backend() (makes sense due to the above)
- about 575 get past the PendingBackendStats.pending_io pg_memory_is_all_zeros()
check in pgstat_flush_backend_entry_io() (makes sense due to the above)

It means that only very few of them (about 575 out of 70K, under 1%) are
"really" flushing IO stats.
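
To illustrate why most of those calls are cheap, here is a minimal,
standalone sketch of the early-exit pattern described above. It is not the
PostgreSQL code: PendingIO, all_zeros(), flush_pending_io() and the two
counters are hypothetical names made up for this example, and the point
where a lock would be taken is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a backend's pending IO counters. */
typedef struct PendingIO
{
    unsigned long reads;
    unsigned long writes;
} PendingIO;

/* Counters used only to show the ratio of calls to real flushes. */
static unsigned long flush_calls = 0;
static unsigned long real_flushes = 0;

/* Return true if every byte of the buffer is zero (illustrative helper). */
static bool
all_zeros(const void *ptr, size_t len)
{
    const unsigned char *p = ptr;

    for (size_t i = 0; i < len; i++)
    {
        if (p[i] != 0)
            return false;
    }
    return true;
}

/*
 * Move pending counters into the shared totals.
 * Returns true only when something was actually flushed.
 */
static bool
flush_pending_io(PendingIO *pending, PendingIO *shared)
{
    flush_calls++;

    /* Cheap early exit: nothing accumulated since the last flush. */
    if (all_zeros(pending, sizeof(*pending)))
        return false;

    /* Assumption: a real implementation would lock only past this point. */
    shared->reads += pending->reads;
    shared->writes += pending->writes;
    memset(pending, 0, sizeof(*pending));
    real_flushes++;
    return true;
}
```

With this shape, a call that finds no pending data costs one memory scan
and never reaches the locked section, which would be consistent with
seeing about 70K calls but only about 575 real flushes.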

> With smaller records, the loop can become hotter, can't it? Also,
> there can be a high number of WAL senders on a single node, and I've
> heard of some customers with complex logical decoding deployments with
> dozens of logical WAL senders. Isn't there a risk of having this code
> path become a point of contention? It seems to me that we should
> benchmark this change more carefully, perhaps even reduce the
> frequency of the report calls.

It sounds like a good idea to measure the impact of those extra calls and see
whether we need to mitigate it. I'll collect some data.
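
If the data shows that mitigation is needed, one possible way to reduce the
frequency of the report calls is a simple time-based throttle. The sketch
below is only an illustration of that idea, not what the patch does:
REPORT_INTERVAL_MS, should_report() and the one-second value are
hypothetical, and the caller is assumed to pass a monotonic timestamp in
milliseconds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical minimum gap between two stats reports, in milliseconds. */
#define REPORT_INTERVAL_MS 1000

static int64_t last_report_ms = 0;
static bool reported_once = false;

/*
 * Return true if a stats report should be made at time now_ms.
 * The first call always reports; later calls report only once at least
 * REPORT_INTERVAL_MS has elapsed since the previous report.
 */
static bool
should_report(int64_t now_ms)
{
    if (reported_once && now_ms - last_report_ms < REPORT_INTERVAL_MS)
        return false;

    last_report_ms = now_ms;
    reported_once = true;
    return true;
}
```

A hot loop would call should_report() on each iteration and attempt the
flush only when it returns true, so the number of flush attempts is bounded
by elapsed time rather than by the number of records sent.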


Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services:
