From: | Tomas Vondra <tomas(at)vondra(dot)me> |
---|---|
To: | Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Draft for basic NUMA observability |
Date: | 2025-04-01 20:17:16 |
Message-ID: | c7ce0a21-18dd-4682-bcd1-f9424bb7e2b9@vondra.me |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
I've spent a bit of time reviewing this. In general I haven't found
anything I'd call a bug, but here's a couple comments for v18 ... Most
of this is in separate "review" commits, with a couple exceptions.
1) Please update the commit messages, with proper formatting, etc. I
tried to do that in the attached v19, but please go through that, add
relevant details, update list of reviewers, etc. The subject should not
be overly long, etc.
2) I don't think we need "libnuma for NUMA awareness" in configure, I'd
use just "libnuma support" similar to other libraries.
3) I don't think we need pg_numa.h to have this:
extern PGDLLIMPORT Datum pg_numa_available(PG_FUNCTION_ARGS);
AFAICS we don't have any SQL functions exposed as PGDLLIMPORT, so why
would it be necessary here? It's enough to have a prototype in .c file.
4) Improved .sgml to have acronym/productname in a couple places.
5) I don't think the comment for pg_buffercache_init_entries() is very
useful. That it's helper for pg_buffercache_pages() tells me nothing
about how to use it, what the arguments are, etc.
6) IMHO pg_buffercache_numa_prepare_ptrs() would deserve a better
comment too. I mean, there's no info about what the arguments are, which
arguments are input or output, etc. And it only discussed one option
(block page < memory page), but what happens in the other case? The
formulas with blk2page/blk2pageoff are not quite clear to me (I'm not
saying it's wrong).
However, it seems rather suspicious that pages_per_blk is calculated as
float, and pg_buffercache_numa_prepare_ptrs() then does this:
for (size_t j = 0; j < pages_per_blk; j++)
{ ... }
I mean, isn't this vulnerable to rounding errors, which might trigger
some weird behavior? If not, it'd be good to add a comment why this is
fine, it confuses me a lot. I personally would probably prefer doing
just integer arithmetic here.
7) This para in the docs seems self-contradictory:
<para>
The <function>pg_buffercache_numa_pages()</function> provides the same
information
as <function>pg_buffercache_pages()</function> but is slower because
it also
provides the <acronym>NUMA</acronym> node ID per shared buffer entry.
The <structname>pg_buffercache_numa</structname> view wraps the
function for
convenient use.
</para>
I mean, "provides the same information, but is slower because it
provides different information" is strange. I understand the point, but
maybe rephrase somehow?
8) Why is pg_numa_available() marked as volatile? Should not change in a
running cluster, no?
9) I noticed the new SGML docs have IDs with mixed "-" and "_". Maybe
let's not do that.
<sect2 id="pgbuffercache-pg-buffercache_numa">
10) I think it'd be good to mention/explain why move_pages is used
instead of get_mempolicy - more efficient with batching, etc. This would
be useful both in the commit message and before the move_pages call (and
in general to explain why pg_buffercache_numa_prepare_ptrs prepares the
pointers like this etc.).
11) This could use UINT64_FORMAT, instead of a cast:
elog(DEBUG1, "NUMA: os_page_count=%lu os_page_size=%zu
pages_per_blk=%.2f",
(unsigned long) os_page_count, os_page_size, pages_per_blk);
regards
--
Tomas Vondra
Attachment | Content-Type | Size |
---|---|---|
v19-0001-Add-support-for-basic-NUMA-awareness.patch | text/x-patch | 21.8 KB |
v19-0002-pgindent.patch | text/x-patch | 711 bytes |
v19-0003-review.patch | text/x-patch | 3.9 KB |
v19-0004-pg_buffercache-split-pg_buffercache_pages-into-p.patch | text/x-patch | 12.9 KB |
v19-0005-review.patch | text/x-patch | 1.6 KB |
v19-0006-Add-pg_buffercache_numa-view-with-NUMA-node-info.patch | text/x-patch | 19.1 KB |
v19-0007-review.patch | text/x-patch | 1.0 KB |
v19-0008-Add-pg_shmem_numa_allocations-to-show-NUMA-node.patch | text/x-patch | 16.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Daniel Gustafsson | 2025-04-01 20:22:07 | Re: Adding support for SSLKEYLOGFILE in the frontend |
Previous Message | Andres Freund | 2025-04-01 20:17:05 | Re: TEMP_CONFIG vs test_aio |