Re: Draft for basic NUMA observability

From:	Tomas Vondra <tomas(at)vondra(dot)me>
To:	Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Cc:	Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Draft for basic NUMA observability
Date:	2025-04-07 19:51:17
Message-ID:	c0d02e4e-6eeb-47d9-9971-f65aa7264ab4@vondra.me
Lists:	pgsql-hackers

On 4/7/25 20:11, Bertrand Drouvot wrote:
> Hi,
>
> On Mon, Apr 07, 2025 at 12:42:21PM -0400, Andres Freund wrote:
>> Hi,
>>
>> On 2025-04-07 18:36:24 +0200, Tomas Vondra wrote:
>>
>> I was thinking of checking if the BufferDesc indicates BM_VALID or
>> BM_TAG_VALID.
>
> Yeah, that's what I did propose in [1] (when we were speaking about get_mempolicy())
> and I think that would make sense as future improvement.
>
>>
>>
>>> I think we need to decide whether the current patches are good enough
>>> for PG18, with the current behavior, and then maybe improve that in
>>> PG19.
>>
>> I think as long as the docs mention this with <note> or <warning> it's ok for
>> now.
>
> +1
>
> A few comments on v27:
>
> === 1
>
> pg_buffercache_numa() reports the node ID as "nodeid" while pg_shmem_allocations_numa()
> reports it as node_id. Maybe we should use the same "naming" in both.
>

This was renamed in v28 to "numa_node" in both parts.

> === 2
>
> postgres=# select count(*) from pg_buffercache;
> count
> -------
> 65536
> (1 row)
>
> but
>
> postgres=# select count(*) from pg_buffercache_numa;
> count
> -------
> 64
> (1 row)
>
> with:
>
> postgres=# show block_size;
> block_size
> ------------
> 2048
>
> and Hugepagesize: 2048 kB.
>
> and
>
> postgres=# show shared_buffers;
> shared_buffers
> ----------------
> 128MB
> (1 row)
>
> And even if for testing I set:
>
> - funcctx->max_calls = idx;
> + funcctx->max_calls = 65536;
>
> then I start to see weird results:
>
> postgres=# select count(*) from pg_buffercache_numa where bufferid not in (select bufferid from pg_buffercache);
> count
> -------
> 65472
> (1 row)
>
> So it looks like that the new way to iterate on the buffers that has been introduced
> in v26/v27 has some issue?
>

Yeah, the calculations of the end pointers were wrong: we need to round
up (using TYPEALIGN()) when calculating the number of pages, and just add
BLCKSZ (without any rounding) when calculating the end of a buffer. The 0004
patch fixes this for me (I tried it with various block sizes / page sizes).
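
To make the two calculations concrete, here is a minimal standalone sketch of the arithmetic. It is not the code from the 0004 patch: ALIGN_UP/ALIGN_DOWN are stand-ins with the same shape as TYPEALIGN()/TYPEALIGN_DOWN(), and the function names, parameters, and the assumption that the OS page size is a power of two are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for TYPEALIGN()/TYPEALIGN_DOWN(); ALIGNVAL must be a power of 2. */
#define ALIGN_UP(ALIGNVAL, LEN) \
	(((uintptr_t) (LEN) + ((ALIGNVAL) - 1)) & ~((uintptr_t) ((ALIGNVAL) - 1)))
#define ALIGN_DOWN(ALIGNVAL, LEN) \
	(((uintptr_t) (LEN)) & ~((uintptr_t) ((ALIGNVAL) - 1)))

/*
 * Number of OS pages touched by the buffer array
 * [start, start + nbuffers * blcksz). The end is rounded UP to a page
 * boundary so that a partially used last page is still counted.
 */
static size_t
pages_spanned(uintptr_t start, size_t nbuffers, size_t blcksz,
			  size_t os_page_size)
{
	uintptr_t	first;
	uintptr_t	last;

	assert(os_page_size > 0 && (os_page_size & (os_page_size - 1)) == 0);

	first = ALIGN_DOWN(os_page_size, start);
	last = ALIGN_UP(os_page_size, start + nbuffers * blcksz);

	return (last - first) / os_page_size;
}

/*
 * Number of (buffer, page) pairs, i.e. rows a per-buffer listing would
 * produce. The end of each buffer is simply buf_start + blcksz, with no
 * rounding; rounding it to a page boundary would miscount the pairs.
 */
static size_t
buffer_page_entries(uintptr_t start, size_t nbuffers, size_t blcksz,
					size_t os_page_size)
{
	size_t		entries = 0;

	for (size_t i = 0; i < nbuffers; i++)
	{
		uintptr_t	buf_start = start + i * blcksz;
		uintptr_t	buf_end = buf_start + blcksz;	/* no rounding */
		uintptr_t	page = ALIGN_DOWN(os_page_size, buf_start);

		/* walk every OS page that overlaps this buffer */
		for (; page < buf_end; page += os_page_size)
			entries++;
	}

	return entries;
}
```

With the settings from the report (2048-byte blocks, 65536 buffers, 2MB huge pages) and assuming a page-aligned start, pages_spanned() gives 64 and buffer_page_entries() gives 65536, so the 64 in the report is the page count rather than the number of buffer entries.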

Thanks for noticing this!

regards

--
Tomas Vondra

Attachment	Content-Type	Size
v29-0001-Add-support-for-basic-NUMA-awareness.patch	text/x-patch	22.1 KB
v29-0002-Introduce-pg_shmem_allocations_numa-view.patch	text/x-patch	18.9 KB
v29-0003-Add-pg_buffercache_numa-view-with-NUMA-node-info.patch	text/x-patch	22.0 KB
v29-0004-fixup.patch	text/x-patch	1.9 KB
