From: | Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Aidar Imamov <a(dot)imamov(at)postgrespro(dot)ru>, Joseph Koshakow <koshy44(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Add pg_buffercache_evict_all() and pg_buffercache_mark_dirty[_all]() functions |
Date: | 2025-04-04 15:36:57 |
Message-ID: | CAN55FZ1P1M1Q6vt11qBUW=CCFKP6p8yBnK9BoqJs2m9Tu2YLhw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
Thanks for the review!
On Wed, 2 Apr 2025 at 20:06, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2025-03-31 19:49:25 +0300, Nazir Bilal Yavuz wrote:
> > > After this discussion, I think it would be helpful if one of the more
> > > experienced
> > > hackers could take a look at the overall picture (perhaps we should set
> > > the
> > > status "Ready for Committer"? To be honest, I'm not sure what step that
> > > status
> > > should be set at).
> >
> > I agree. I will mark this as 'ready for committer' after writing the tests.
>
> Are you still working on tests?
Sorry, I forgot. They are attached as 0004.
> > > Also, there's a little thing about declaring functions as PARALLEL SAFE.
> > > To be honest,
> > > I don't really have any solid arguments against it. I just have some
> > > doubts. For
> > > example, how will it work if the plan is split up and we try to work on
> > > an object in
> > > one part, while the other part of the plan evicts the pages of that
> > > object or marks
> > > them as dirty... I can't really say for sure about that. And in that
> > > context, I'd
> > > suggest referring to that awesome statement in the documentation: "If in
> > > doubt,
> > > functions should be labeled as UNSAFE, which is the default."
> >
> > You may be right. I thought they are parallel safe as we acquire the
> > buffer header lock for each buffer but now, I have the same doubts as
> > you. I want to hear other people's opinions on that before labeling
> > them as UNSAFE.
>
> I don't see a problem with them being parallel safe.
>
>
> > From a59be4b50d7691ba952d0abd32198e972d4a6ea0 Mon Sep 17 00:00:00 2001
> > From: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
> > Date: Mon, 24 Mar 2025 13:52:04 +0300
> > Subject: [PATCH v5 1/3] Add pg_buffercache_evict_[relation | all]() functions
> > for testing
> >
> > pg_buffercache_evict_relation(): Evicts all shared buffers in a
> > relation at once.
> > pg_buffercache_evict_all(): Evicts all shared buffers at once.
> >
> > Both functions provide mechanism to evict multiple shared buffers at
> > once. They are designed to address the inefficiency of repeatedly calling
> > pg_buffercache_evict() for each individual buffer, which can be
> > time-consuming when dealing with large shared buffer pools.
> > (e.g., ~790ms vs. ~270ms for 16GB of shared buffers).
> >
> > These functions are intended for developer testing and debugging
> > purposes and are available to superusers only.
>
>
> > + if (!superuser())
> > + ereport(ERROR,
> > + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> > + errmsg("must be superuser to use pg_buffercache_evict_relation function")));
>
> I think it'd be nicer if we didn't ask translators to translate 3 different
> versions of this message, for different functions. Why not pass in the functio
> name as a parameter?
Ah, I didn't think of this aspect, done.
> > + if (RelationUsesLocalBuffers(rel))
> > + ereport(ERROR,
> > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > + errmsg("relation uses local buffers,"
> > + "pg_buffercache_evict_relation function is intended to"
> > + "be used for shared buffers only")));
>
> We try not to break messages across multiple lines, as that makes searching
> for them harder.
Done.
> > + EvictRelUnpinnedBuffers(rel, &buffers_evicted, &buffers_flushed);
> > +
> > + /* Close relation, release lock. */
>
> This comment imo isn't useful, it just restates the code verbatim.
Removed.
>
> > + relation_close(rel, AccessExclusiveLock);
>
> Hm. Why are we dropping the lock here early? It's probably ok, but it's not
> clear to me why we should do so.
We are dropping the lock after we processed the relation. I didn't
understand what could be the problem here. Why do you think it is
early?
> > + /* Build and return the tuple. */
>
> Similar to above, the comment is just a restatement of a short piece of code.
Done. It exists in other places in the pg_buffercache_pages.c file, I
only removed the ones that I added.
> > + <para>
> > + The <function>pg_buffercache_evict_relation()</function> function allows all
> > + shared buffers in the relation to be evicted from the buffer pool given a
> > + relation identifier. Use of this function is restricted to superusers only.
> > + </para>
>
> I'd say "all unpinned shared buffers".
Done.
> I wonder if these functions ought to also return the number of buffers that
> could not be evicted?
I didn't add this as I thought this information could be gathered if wanted.
> Should this, or the longer comment for the function, mention that buffers for
> all relation forks are evicted?
I added this to a longer comment for the function.
> > + <para>
> > + The <function>pg_buffercache_evict_all()</function> function allows all
> > + shared buffers to be evicted in the buffer pool. Use of this function is
> > + restricted to superusers only.
> > + </para>
>
> Basically the same comments as above, except for the fork portion.
Done.
> > diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
> > index f9681d09e1e..5d82e3fa297 100644
> > --- a/src/backend/storage/buffer/bufmgr.c
> > +++ b/src/backend/storage/buffer/bufmgr.c
> > @@ -6512,26 +6512,47 @@ ResOwnerPrintBufferPin(Datum res)
> > * even by the same block. This inherent raciness without other interlocking
> > * makes the function unsuitable for non-testing usage.
> > *
> > + * buf_state is used to check if the buffer header lock has been acquired
> > + * before calling this function. If buf_state has a non-zero value, it means
> > + * that the buffer header has already been locked. This information is useful
> > + * for evicting a specific relation's buffers, as the buffers' headers need to
> > + * be locked before this function can be called with such an intention.
>
> I don't like this aspect of the API changes one bit. IMO something callable
> from outside storage/buffer should have no business knowing about buf_state.
>
> And detecting whether already hold a lock by checking whether the buf_state is
> 0 feels really wrong to me.
I changed this. I basically copied EvictUnpinnedBuffer() to the inside
of EvictRelUnpinnedBuffers(), we don't need any hacky methods now.
> > +/*
> > + * Try to evict all the shared buffers containing provided relation's pages.
> > + *
> > + * This function is intended for testing/development use only!
> > + *
> > + * We need this helper function because of the following reason.
> > + * ReservePrivateRefCountEntry() needs to be called before acquiring the
> > + * buffer header lock but ReservePrivateRefCountEntry() is static and it would
> > + * be better to have it as static. Hence, it can't be called from outside of
> > + * this file. This helper function is created to bypass that problem.
>
> I think there are other reasons - we should minimize the amount of code
> outside of storage/buffer that needs to know about BuferDescs etc, it's a
> layering violation to access that outside.
I added that as a comment.
> > + * Before calling this function, it is important to acquire
> > + * AccessExclusiveLock on the specified relation to avoid replacing the
> > + * current block of this relation with another one during execution.
>
> What do you mean with "current block of this relation"?
It is used timewise, like a snapshot of the blocks when the function
is called. I updated this with:
'The caller must hold at least AccessShareLock on the relation to
prevent the relation from being dropped.' [1]
> > + * buffers_evicted and buffers_flushed are set the total number of buffers
> > + * evicted and flushed respectively.
> > + */
> > +void
> > +EvictRelUnpinnedBuffers(Relation rel, int32 *buffers_evicted, int32 *buffers_flushed)
> > +{
> > + bool flushed;
> > +
> > + Assert(buffers_evicted && buffers_flushed);
> > + *buffers_evicted = *buffers_flushed = 0;
>
> I'd personally not bother with the Assert, the next line would just crash
> anyway...
AIO tests already use EvictUnpinnedBuffer() without flushed
information. I made '*buffers_evicted, *buffers_flushed in the
EvictRelUnpinnedBuffers() and *flushed in the EvictUnpinnedBuffer()'
optional as Aidar suggested at upthread.
> > diff --git a/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql b/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql
> > index 2e255f3fc10..d54bb1fd6f8 100644
> > --- a/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql
> > +++ b/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql
> > @@ -1,5 +1,13 @@
> > \echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.6'" to load this file. \quit
> >
> > +DROP FUNCTION pg_buffercache_evict(integer);
> > +CREATE FUNCTION pg_buffercache_evict(
> > + IN int,
> > + OUT evicted boolean,
> > + OUT flushed boolean)
> > +AS 'MODULE_PATHNAME', 'pg_buffercache_evict'
> > +LANGUAGE C PARALLEL SAFE VOLATILE STRICT;
>
> I assume the function has to be dropped because the return type changes?
Correct.
v6 is attached. Additional changes prior to v5:
* Tests are added.
* Andres said off-list that AccessExclusiveLock is too much, all the
lock needs to do is to prevent the relation from being dropped. So,
pg_buffercache_evict_relation() opens a relation with AccessShareLock
now. Function comment in EvictRelUnpinnedBuffers() is updated
regarding that. [1]
* EvictRelUnpinnedBuffers() does not call EvictUnpinnedBuffer() now. I
basically copied EvictUnpinnedBuffer() to the inside of
EvictRelUnpinnedBuffers() with the needed updates.
--
Regards,
Nazir Bilal Yavuz
Microsoft
Attachment | Content-Type | Size |
---|---|---|
v6-0001-Return-buffer-flushed-information-in-pg_buffercac.patch | text/x-patch | 9.1 KB |
v6-0002-Add-pg_buffercache_evict_-relation-all-functions-.patch | text/x-patch | 13.1 KB |
v6-0003-Add-pg_buffercache_mark_dirty-_all-functions-for-.patch | text/x-patch | 9.5 KB |
v6-0004-Extend-pg_buffercache-extension-s-tests.patch | text/x-patch | 6.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Rahila Syed | 2025-04-04 15:45:34 | Re: Improve monitoring of shared memory allocations |
Previous Message | Peter Geoghegan | 2025-04-04 15:33:56 | Re: Replace IN VALUES with ANY in WHERE clauses during optimization |