From: | torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Cc: | Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, rjuju123(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, Bruce Momjian <bruce(at)momjian(dot)us> |
Subject: | Re: RFC: Allow EXPLAIN to Output Page Fault Information |
Date: | 2025-01-27 09:04:59 |
Message-ID: | 2c9d6eaf26df17bec13bb03bf1e9bcbb@oss.nttdata.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On Tue, Jan 7, 2025 at 5:09 AM Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>
wrote:
> The core functionality works well in my opinion. I think it makes sense
> to spend the effort to move this from PoC quality to something
> committable. Below some of the things that are necessary to do that
> after an initial pass over the code (and trying it out):
Attached a Patch.
On Tue, Jan 7, 2025 at 10:42 PM Atsushi Torikoshi
<torikoshia(dot)tech(at)gmail(dot)com> wrote:
>> 9. I think this division by 2 could use some explanation in a comment.
>> I
>> understand that you're doing this because linux divides its
>> original
>> bytes using 512 bytes[2] and your additional factor of 2 gets that
>> to
>> 1024 bytes. But that's not clear immediately from the code.
>>
>> I'm also not convinced that 512 is the blocksize if this logic is
>> even correct on every platform. I'm wondering if maybe we should
>> simply show the blocks after all.
>
> Maybe so. I'll look into this and then decide the unit.
I looked up the manuals for the following operating systems, as
documented in [1], and it seems that all of them—except Windows—support
getrusage(2) and return ru_inblock/ru_oublock:
Linux, Windows, FreeBSD, OpenBSD, NetBSD, DragonFlyBSD, macOS, AIX,
Solaris, and illumos.
However, I’m unsure if the unit of these values is consistently 512KB
across all operating systems.
Additionally, I’m concerned that the timing of when these metrics are
incremented might vary between OSs.
For example, on Linux, it seems that ru_oublock is incremented when a
page is dirtied, as it’s calculated by dividing write_bytes [2] by 9.
I’m not sure if other operating systems behave the same way.
That said, they all represent the number of storage I/O operations
performed or to be performed.
Therefore, I believe it would be reasonable to report the raw values
as-is, as they should still be useful for understanding storage I/O
activity.
Example output:
=# explain analyze select max(a), max(b) from t_big_ul;
(..snip..)
Planning:
Buffers: shared hit=31 read=54
Storage I/O: read=2744 times write=0 times
Planning Time: 30.685 ms
Execution:
Storage I/O: read=2563600 times write=0 times
Execution Time: 1685.272 ms
[1] https://www.postgresql.org/docs/devel/supported-platforms.html
[2] https://www.kernel.org/doc/Documentation/filesystems/proc.txt
--
Regards,
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Add-storage-I-O-tracking-to-BUFFERS-option.patch | text/x-diff | 33.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Jelte Fennema-Nio | 2025-01-27 09:05:54 | Re: New process of getting changes into the commitfest app |
Previous Message | Jelte Fennema-Nio | 2025-01-27 09:03:07 | Re: New process of getting changes into the commitfest app |