Make pg_stat_io view count IOs as bytes instead of blocks

From: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Make pg_stat_io view count IOs as bytes instead of blocks
Date: 2024-09-11 11:19:02
Message-ID: CAN55FZ103QUxKViKa86xJ3F_qnm4T3s2p__BdauqWXCcmdeMQg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Currently, in the pg_stat_io view, IOs are counted as blocks. However,
there are two issues with this approach:

1- The actual number of IO requests to the kernel is lower because IO
requests can be merged before sending the final request. Additionally, it
appears that all IOs are counted in block size.
2- Some IOs may not align with block size. For example, WAL read IOs are
done in variable bytes and it is not possible to correctly show these IOs
in the pg_stat_io view [1].

To address this, I propose showing the total number of IO requests to the
kernel (as smgr function calls) and the total number of bytes in the IO. To
implement this change, the op_bytes column will be removed from the
pg_stat_io view. Instead, the [reads | writes | extends] columns will track
the total number of IO requests, and newly added [read | write |
extend]_bytes columns will track the total number of bytes in the IO.

Example benefit of this change:

Running query [2], the result is:

╔═══════════════════╦══════════╦══════════╦═══════════════╗
║ backend_type ║ object ║ context ║ avg_io_blocks ║
╠═══════════════════╬══════════╬══════════╬═══════════════╣
║ client backend ║ relation ║ bulkread ║ 15.99 ║
╠═══════════════════╬══════════╬══════════╬═══════════════╣
║ background worker ║ relation ║ bulkread ║ 15.99 ║
╚═══════════════════╩══════════╩══════════╩═══════════════╝

You can rerun the same query [2] after setting io_combine_limit to 32 [3].
The result is:

╔═══════════════════╦══════════╦══════════╦═══════════════╗
║ backend_type ║ object ║ context ║ avg_io_blocks ║
╠═══════════════════╬══════════╬══════════╬═══════════════╣
║ client backend ║ relation ║ bulkread ║ 31.70 ║
╠═══════════════════╬══════════╬══════════╬═══════════════╣
║ background worker ║ relation ║ bulkread ║ 31.60 ║
╚═══════════════════╩══════════╩══════════╩═══════════════╝

I believe that having visibility into avg_io_[bytes | blocks] is valuable
information that could help optimize Postgres.

Any feedback would be appreciated.

[1]
https://www.postgresql.org/message-id/CAN55FZ1ny%2B3kpdm5X3nGZ2Jp3wxZO-744eFgxktS6YQ3%3DOKR-A%40mail.gmail.com

[2]
CREATE TABLE t as select i, repeat('a', 600) as filler from
generate_series(1, 10000000) as i;
SELECT pg_stat_reset_shared('io');
SELECT * FROM t WHERE i = 0;
SELECT backend_type, object, context, TRUNC((read_bytes / reads / (SELECT
current_setting('block_size')::numeric)), 2) as avg_io_blocks FROM
pg_stat_io WHERE reads > 0;

[3] SET io_combine_limit TO 32;

--
Regards,
Nazir Bilal Yavuz
Microsoft

Attachment Content-Type Size
v1-0001-Make-pg_stat_io-count-IOs-as-bytes-instead-of-blo.patch text/x-patch 21.9 KB

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2024-09-11 11:25:08 Re: json_query conditional wrapper bug
Previous Message Bertrand Drouvot 2024-09-11 10:51:38 Re: Add contrib/pg_logicalsnapinspect