Re: Add pg_buffercache_evict_all() and pg_buffercache_mark_dirty[_all]() functions

From: Joseph Koshakow <koshy44(at)gmail(dot)com>
To: Aidar Imamov <a(dot)imamov(at)postgrespro(dot)ru>
Cc: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add pg_buffercache_evict_all() and pg_buffercache_mark_dirty[_all]() functions
Date: 2025-03-20 22:25:07
Message-ID: CAAvxfHeHBL6XxypexiSu0geH3hkHiFX=ApUaeaU+hgQEfjAwxQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi I am working with Aidar to give a review and I am also a beginner
reviewer.

> From 813e5ec0da4c65970b4b1ce2ec2918e4652da9ab Mon Sep 17 00:00:00 2001
> From: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
> Date: Fri, 20 Dec 2024 14:06:47 +0300
> Subject: [PATCH v1 1/2] Add pg_buffercache_evict_all() function for
testing
> */
> bool
> -EvictUnpinnedBuffer(Buffer buf)
> +EvictUnpinnedBuffer(Buffer buf, bool *flushed)

I think you should update the comment above this function to include
details on the new `flushed` argument.

> diff --git a/doc/src/sgml/pgbuffercache.sgml
b/doc/src/sgml/pgbuffercache.sgml
> index 802a5112d77..83950ca5cce 100644
> --- a/doc/src/sgml/pgbuffercache.sgml
> +++ b/doc/src/sgml/pgbuffercache.sgml
> @@ -31,8 +31,10 @@
> This module provides the <function>pg_buffercache_pages()</function>
> function (wrapped in the <structname>pg_buffercache</structname> view),
> the <function>pg_buffercache_summary()</function> function, the
> - <function>pg_buffercache_usage_counts()</function> function and
> - the <function>pg_buffercache_evict()</function> function.
> + <function>pg_buffercache_usage_counts()</function> function, the
> + <function>pg_buffercache_evict()</function>, the
> + <function>pg_buffercache_evict_relation()</function>, and the
> + <function>pg_buffercache_evict_all()</function> function.

All the other functions have indexterm tags above this, should we add
indexterms for the new functions?

> + <sect2 id="pgbuffercache-pg-buffercache-evict-relation">
> + <title>The <structname>pg_buffercache_evict_relation</structname>
Function</title>
> + <para>
> + The <function>pg_buffercache_evict_relation()</function> function is
very similar
> + to <function>pg_buffercache_evict()</function> function. The
difference is that
> + <function>pg_buffercache_evict_relation()</function> takes a relation
> + identifier instead of buffer identifier. Then, it tries to evict all
> + buffers in that relation. The function is intended for developer
testing only.
> + </para>
> + </sect2>
> +
> + <sect2 id="pgbuffercache-pg-buffercache-evict-all">
> + <title>The <structname>pg_buffercache_evict_all</structname>
Function</title>
> + <para>
> + The <function>pg_buffercache_evict_all()</function> function is very
similar
> + to <function>pg_buffercache_evict()</function> function. The
difference is,
> + the <function>pg_buffercache_evict_all()</function> does not take
argument;
> + instead it tries to evict all buffers in the buffer pool. The
function is
> + intended for developer testing only.
> + </para>
> + </sect2>

The other difference is that these new functions have an additional
output argument to indicate if the buffer was flushed which we probably
want to mention here.

Also I think it's more gramatically correct to say "the <fn-name>
function" or just "<fn-name>". A couple of times you say "<fn-name>
function" or "the <fn-name>".

Also I think the third to last sentence should end with "... does not
take **an** argument" or "... does not take argument**s**".

> +CREATE FUNCTION pg_buffercache_evict_relation(
> + IN regclass,
> + IN fork text default 'main',
> + OUT buffers_evicted int4,
> + OUT buffers_flushed int4)
> +AS 'MODULE_PATHNAME', 'pg_buffercache_evict_relation'
> +LANGUAGE C PARALLEL SAFE VOLATILE;
> +
> +CREATE FUNCTION pg_buffercache_evict_all(
> + OUT buffers_evicted int4,
> + OUT buffers_flushed int4)
> +AS 'MODULE_PATHNAME', 'pg_buffercache_evict_all'
> +LANGUAGE C PARALLEL SAFE VOLATILE;

Does it make sense to also update pg_buffercache_evict to also return
a bool if the buffer was flushed? Or is that too much of a breaking
change?

> + /* Lock the header and check if it's valid. */
> + buf_state = LockBufHdr(desc);
> + if ((buf_state & BM_VALID) == 0)
> + {
> + UnlockBufHdr(desc, buf_state);
> + return false;
> + }

EvictUnpinnedBuffer first checks if the buffer is locked before
attempting to lock it. Do we need a similar check here?

/* Lock the header if it is not already locked. */
if (!buf_state)
buf_state = LockBufHdr(desc);

>> I don't think *flushed is necessarily accurate this way, as it might
detect
>> that it doesn't need to flush the data (via StartBufferIO() returning
false).
>
> You are right. It seems there is no way to get that information
> without editing the FlushBuffer(), right? And I think editing
> FlushBuffer() is not worth it. So, I can either remove it or explain
> in the docs that #buffers_flushed may not be completely accurate.

There's already a lot of caveats on EvictUnpinnedBuffer that it might
have unexpected results when run concurrently, so I think it's fine to
add one more.

Thanks,
Joe Koshakow

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2025-03-20 22:38:10 Re: RFC: Additional Directory for Extensions
Previous Message Andres Freund 2025-03-20 21:37:07 Re: md.c vs elog.c vs smgrreleaseall() in barrier