From: | Atsushi Torikoshi <torikoshia(dot)tech(at)gmail(dot)com> |
---|---|
To: | Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, rjuju123(at)gmail(dot)com |
Cc: | torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, tgl(at)sss(dot)pgh(dot)pa(dot)us, Bruce Momjian <bruce(at)momjian(dot)us>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: RFC: Allow EXPLAIN to Output Page Fault Information |
Date: | 2025-01-07 13:41:47 |
Message-ID: | CAM6-o=BE=oewSsdNKrbbNjR2muzQR49STXsuta+Wpq6CSyxTag@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Jan 6, 2025 at 6:59 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> Hi,
>
> On Mon, Jan 06, 2025 at 06:49:06PM +0900, torikoshia wrote:
> >
> > > **However, I think the general direction has merit**: Changing this
> > > patch to
> > > use `ru_inblock`/`ru_oublock` gives very useful insights. `ru_inblock`
> > > is 0 when everything is in page cache, and it is very high when stuff
> is
> > > not. I was only hacking around and basically did this:
> > >
> > > s/ru_minflt/ru_inblock/g
> > > s/ru_majflt/ru_oublock/g
> > > [...]
> > > Also, maybe there's some better way
> > > of getting read/write numbers for the current process than
> > > ru_inblock/ru_oublock (but this one seems to work at least reasonably
> > > well).
> >
> > Maybe, but as far as using getrusage(), ru_inblock and ru_outblock seem
> the
> > best.
>
> FWIW that's the counters we've been using in pg_stat_kcache / powa to
> compute
> disk IO and "postgres shared buffers hit VS OS cache hit VS disk read" hit
> ratio for many years and we didn't get any problem report for that.
Thanks for sharing the information!
On Tue, Jan 7, 2025 at 5:09 AM Jelte Fennema-Nio <postgres(at)jeltef(dot)nl> wrote:
> On Mon Jan 6, 2025 at 10:49 AM CET, torikoshia wrote:
> > On Tue, Dec 31, 2024 at 7:57 AM Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>
> > Updated the PoC patch to calculate them by KB:
> >
> > =# EXPLAIN (ANALYZE, STORAGEIO) SELECT * FROM pgbench_accounts;
> > QUERY PLAN
> >
> >
> ---------------------------------------------------------------------------------------------------------------------------------
> > Seq Scan on pgbench_accounts (cost=0.00..263935.35 rows=10000035
> > width=97) (actual time=1.447..3900.279 rows=10000000 loops=1)
> > Buffers: shared hit=2587 read=161348
> > Planning Time: 0.367 ms
> > Execution:
> > Storage I/O: read=1291856 KB write=0 KB
> > Execution Time: 4353.253 ms
> > (6 rows)
> >
> >
>
> The core functionality works well in my opinion. I think it makes sense
> to spend the effort to move this from PoC quality to something
> committable.
Thanks for the comment and review!
If there are no other opinions, I will make a patch based on the direction
of the current PoC patch.
> 4. I think this setting should be enabled by default for ANALYZE, just
> like BUFFERS is now since c2a4078e[1].
>
> 5. I'm wondering whether this option deserves its own EXPLAIN option, or
> if it should simply be made part of BUFFERS.
Although this is not information about PostgreSQL buffers, I now feel that
making this addition part of BUFFERS is attractive, since people using
BUFFERS would also sometimes want to know about the storage I/O.
Since BUFFERS is now ON by default with EXPLAIN ANALYZE, I am concerned
about the performance impact.
However, if it is limited to just twice—once at the start and once at the
end—for the planning phase, execution phase, and each parallel worker, I
believe the impact is relatively small.
> 9. I think this division by 2 could use some explanation in a comment. I
> understand that you're doing this because linux divides its original
> bytes using 512 bytes[2] and your additional factor of 2 gets that to
> 1024 bytes. But that's not clear immediately from the code.
>
> I'm also not convinced that 512 is the blocksize if this logic is
> even correct on every platform. I'm wondering if maybe we should
> simply show the blocks after all.
>
Maybe so. I'll look into this and then decide the unit.
For the other comments, I plan to address them as you suggested.
Regards,
Atsushi Torikoshi
From | Date | Subject | |
---|---|---|---|
Next Message | Ilia Evdokimov | 2025-01-07 13:51:17 | Re: Sample rate added to pg_stat_statements |
Previous Message | Yura Sokolov | 2025-01-07 13:40:41 | Re: Meson bug in detection of 64 atomics |