From: | Aidar Imamov <a(dot)imamov(at)postgrespro(dot)ru> |
---|---|
To: | Joseph Koshakow <koshy44(at)gmail(dot)com> |
Cc: | Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Add pg_buffercache_evict_all() and pg_buffercache_mark_dirty[_all]() functions |
Date: | 2025-03-23 19:16:38 |
Message-ID: | 76a550315baef9d7424b70144f1c6a2d@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Joseph Koshakow 2025-03-21 01:25:
> Hi I am working with Aidar to give a review and I am also a beginner
> reviewer.
>
>> From 813e5ec0da4c65970b4b1ce2ec2918e4652da9ab Mon Sep 17 00:00:00
> 2001
>> From: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
>> Date: Fri, 20 Dec 2024 14:06:47 +0300
>> Subject: [PATCH v1 1/2] Add pg_buffercache_evict_all() function for
> testing
>> */
>> bool
>> -EvictUnpinnedBuffer(Buffer buf)
>> +EvictUnpinnedBuffer(Buffer buf, bool *flushed)
>
> I think you should update the comment above this function to include
> details on the new `flushed` argument.
>
>> diff --git a/doc/src/sgml/pgbuffercache.sgml
> b/doc/src/sgml/pgbuffercache.sgml
>> index 802a5112d77..83950ca5cce 100644
>> --- a/doc/src/sgml/pgbuffercache.sgml
>> +++ b/doc/src/sgml/pgbuffercache.sgml
>> @@ -31,8 +31,10 @@
>> This module provides the
> <function>pg_buffercache_pages()</function>
>> function (wrapped in the <structname>pg_buffercache</structname>
> view),
>> the <function>pg_buffercache_summary()</function> function, the
>> - <function>pg_buffercache_usage_counts()</function> function and
>> - the <function>pg_buffercache_evict()</function> function.
>> + <function>pg_buffercache_usage_counts()</function> function, the
>> + <function>pg_buffercache_evict()</function>, the
>> + <function>pg_buffercache_evict_relation()</function>, and the
>> + <function>pg_buffercache_evict_all()</function> function.
>
> All the other functions have indexterm tags above this, should we add
> indexterms for the new functions?
>
>> + <sect2 id="pgbuffercache-pg-buffercache-evict-relation">
>> + <title>The <structname>pg_buffercache_evict_relation</structname>
> Function</title>
>> + <para>
>> + The <function>pg_buffercache_evict_relation()</function>
> function is very similar
>> + to <function>pg_buffercache_evict()</function> function. The
> difference is that
>> + <function>pg_buffercache_evict_relation()</function> takes a
> relation
>> + identifier instead of buffer identifier. Then, it tries to
> evict all
>> + buffers in that relation. The function is intended for
> developer testing only.
>> + </para>
>> + </sect2>
>> +
>> + <sect2 id="pgbuffercache-pg-buffercache-evict-all">
>> + <title>The <structname>pg_buffercache_evict_all</structname>
> Function</title>
>> + <para>
>> + The <function>pg_buffercache_evict_all()</function> function is
> very similar
>> + to <function>pg_buffercache_evict()</function> function. The
> difference is,
>> + the <function>pg_buffercache_evict_all()</function> does not
> take argument;
>> + instead it tries to evict all buffers in the buffer pool. The
> function is
>> + intended for developer testing only.
>> + </para>
>> + </sect2>
>
> The other difference is that these new functions have an additional
> output argument to indicate if the buffer was flushed which we
> probably
> want to mention here.
>
> Also I think it's more gramatically correct to say "the <fn-name>
> function" or just "<fn-name>". A couple of times you say "<fn-name>
> function" or "the <fn-name>".
>
> Also I think the third to last sentence should end with "... does not
> take **an** argument" or "... does not take argument**s**".
>
>> +CREATE FUNCTION pg_buffercache_evict_relation(
>> + IN regclass,
>> + IN fork text default 'main',
>> + OUT buffers_evicted int4,
>> + OUT buffers_flushed int4)
>> +AS 'MODULE_PATHNAME', 'pg_buffercache_evict_relation'
>> +LANGUAGE C PARALLEL SAFE VOLATILE;
>> +
>> +CREATE FUNCTION pg_buffercache_evict_all(
>> + OUT buffers_evicted int4,
>> + OUT buffers_flushed int4)
>> +AS 'MODULE_PATHNAME', 'pg_buffercache_evict_all'
>> +LANGUAGE C PARALLEL SAFE VOLATILE;
>
> Does it make sense to also update pg_buffercache_evict to also return
> a bool if the buffer was flushed? Or is that too much of a breaking
> change?
>
>> + /* Lock the header and check if it's valid. */
>> + buf_state = LockBufHdr(desc);
>> + if ((buf_state & BM_VALID) == 0)
>> + {
>> + UnlockBufHdr(desc, buf_state);
>> + return false;
>> + }
>
> EvictUnpinnedBuffer first checks if the buffer is locked before
> attempting to lock it. Do we need a similar check here?
>
> /* Lock the header if it is not already locked. */
> if (!buf_state)
> buf_state = LockBufHdr(desc);
>
>>> I don't think *flushed is necessarily accurate this way, as it
> might detect
>>> that it doesn't need to flush the data (via StartBufferIO()
> returning false).
>>
>> You are right. It seems there is no way to get that information
>> without editing the FlushBuffer(), right? And I think editing
>> FlushBuffer() is not worth it. So, I can either remove it or explain
>> in the docs that #buffers_flushed may not be completely accurate.
>
> There's already a lot of caveats on EvictUnpinnedBuffer that it might
> have unexpected results when run concurrently, so I think it's fine to
> add one more.
>
> Thanks,
> Joe Koshakow
I agree with most of what Joseph said. However, I would like to add some
comments.
At the moment, the "flushed" flag essentially indicates whether the
buffer
was dirty at the time of eviction and it does not guarantee that it has
been
written to disk. Therefore, it would be better to rename the
buffers_flushed
column in the output of pg_buffer_cache_evict_all() and
pg_buffercache_evict_relation() functions to dirty_buffers mb? This
would
allow us to avoid the confusion that arises from the fact that not all
dirty
buffers could have actually been written to disk. In addition, this
would
remove the "flushed" parameter from the EvictUnpinnedBuffer() function.
Because if we explicitly call LockBufHdr() outside of
EvictUnpinnedBuffer(),
we can already know in advance whether the buffer is dirty or not.
The same applies to the suggestion to retrieve "flushed" count from the
pg_buffercache_evict() call. We cannot say this for certain, but we can
determine whether the buffer was dirty.
regards,
Aidar Imamov
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2025-03-23 21:16:18 | Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment |
Previous Message | Tom Lane | 2025-03-23 19:10:02 | Re: Add Postgres module info |