[BUG]: the walsender does not update its IO statistics until it exits

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: [BUG]: the walsender does not update its IO statistics until it exits
Date: 2025-02-25 13:42:08
Message-ID: Z73IsKBceoVd4t55@ip-10-97-1-34.eu-west-3.compute.internal
Lists: pgsql-hackers

Hi hackers,

While doing some tests for [1], I observed that $SUBJECT.

To observe this behavior on master:

1. create a logical replication slot

postgres=# SELECT * FROM pg_create_logical_replication_slot('logical_slot', 'test_decoding', false, true);
  slot_name   |    lsn
--------------+------------
 logical_slot | 0/40749508
(1 row)

2. create a table and add some data

postgres=# create table bdt (a int);
CREATE TABLE
postgres=# insert into bdt select a from generate_series(1,10000) a ;
INSERT 0 10000

3. start pg_recvlogical this way

pg_recvlogical -d postgres -S logical_slot -f - --no-loop --start

4. query pg_stat_io

postgres=# select backend_type,object,context,reads,read_bytes from pg_stat_io where backend_type = 'walsender';
 backend_type |    object     |  context  | reads | read_bytes
--------------+---------------+-----------+-------+------------
 walsender    | relation      | bulkread  |     0 |          0
 walsender    | relation      | bulkwrite |     0 |          0
 walsender    | relation      | init      |     0 |          0
 walsender    | relation      | normal    |     6 |      49152
 walsender    | relation      | vacuum    |     0 |          0
 walsender    | temp relation | normal    |     0 |          0
 walsender    | wal           | init      |       |
 walsender    | wal           | normal    |     0 |          0
(8 rows)

The non-zero stats that we see here come from the pgstat_report_stat() call in
PostgresMain(), not from the walsender's decoding activity per se (the proof is
that the wal object rows still show no reads, although the walsender certainly
had to read some WAL).

5. once pg_recvlogical is stopped with ctrl-c, we get:

postgres=# select backend_type,object,context,reads,read_bytes from pg_stat_io where backend_type = 'walsender';
 backend_type |    object     |  context  | reads | read_bytes
--------------+---------------+-----------+-------+------------
 walsender    | relation      | bulkread  |     0 |          0
 walsender    | relation      | bulkwrite |     0 |          0
 walsender    | relation      | init      |     0 |          0
 walsender    | relation      | normal    |     9 |      73728
 walsender    | relation      | vacuum    |     0 |          0
 walsender    | temp relation | normal    |     0 |          0
 walsender    | wal           | init      |       |
 walsender    | wal           | normal    |    98 |     793856
(8 rows)

Now we can see that the numbers increased for the relation object and that we
get non-zero numbers for the wal object too (which makes complete sense).

With the attached patch applied, we would get the same numbers already in
step 4 (meaning that the stats are flushed without having to wait for the
walsender to exit).

Remarks:

R1. The extra flushes are done in WalSndLoop(): I believe this is the right place
for them.
R2. A test is added in 035_standby_logical_decoding.pl: while this TAP test
is already racy ([2]), it looks like a good place as we don't want pg_recvlogical
to stop/exit.
R3. The test can also be back-patched down to 16_STABLE, as 035_standby_logical_decoding.pl
was introduced in 16 (and so was pg_stat_io).
R4. The test fails if the extra flushes are not applied/done, which makes complete
sense.
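
To illustrate the difference between step 4 and step 5, and why flushing from
inside the send loop helps, here is a toy model in C. It is not the attached
patch and it does not use PostgreSQL's pgstat API: IoCounters, count_read(),
flush_pending(), run_sender_loop() and the flush_every parameter are all
hypothetical names made up for this sketch. It only assumes that IO counters are
first accumulated in process-local memory and become visible to other sessions
once they are flushed. The 8192-byte read size is arbitrary.

```c
#include <assert.h>

/* Toy model only: none of these names exist in PostgreSQL. */
typedef struct IoCounters
{
    long reads;
    long read_bytes;
} IoCounters;

static IoCounters pending;  /* process-local, invisible to other sessions */
static IoCounters shared;   /* stands in for what pg_stat_io would report */

/* Record one read in the process-local counters. */
static void
count_read(long nbytes)
{
    pending.reads++;
    pending.read_bytes += nbytes;
}

/* Move the process-local counters into the shared ones. */
static void
flush_pending(void)
{
    shared.reads += pending.reads;
    shared.read_bytes += pending.read_bytes;
    pending.reads = 0;
    pending.read_bytes = 0;
}

/*
 * Simulate a sender loop doing one read per iteration.
 * flush_every == 0 means "flush only at exit"; otherwise the pending
 * counters are flushed every flush_every iterations.
 * Returns the number of reads visible in "shared" just before exit.
 */
static long
run_sender_loop(int iterations, int flush_every)
{
    long visible_before_exit;

    assert(iterations >= 0 && flush_every >= 0);

    pending = (IoCounters) {0, 0};
    shared = (IoCounters) {0, 0};

    for (int i = 1; i <= iterations; i++)
    {
        count_read(8192);
        if (flush_every > 0 && i % flush_every == 0)
            flush_pending();
    }

    visible_before_exit = shared.reads;
    flush_pending();            /* exit-time flush */
    return visible_before_exit;
}
```

Under these assumptions, run_sender_loop(98, 0) returns 0 (nothing is visible
while the loop runs, as in step 4), while run_sender_loop(98, 10) returns 90.
In both cases shared.reads ends at 98 after the exit-time flush, as in step 5.
How often the real loop should flush is a separate question that this sketch
does not try to answer.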

[1]: https://www.postgresql.org/message-id/flat/Z3zqc4o09dM/Ezyz%40ip-10-97-1-34.eu-west-3.compute.internal
[2]: https://www.postgresql.org/message-id/Z6oRgmD8m7zBo732%40ip-10-97-1-34.eu-west-3.compute.internal

Looking forward to your feedback,

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v1-0001-Flush-the-IO-statistics-of-active-walsenders.patch text/x-diff 2.8 KB
