From: | Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> |
---|---|
To: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
Cc: | Kirill Reshke <reshkekirill(at)gmail(dot)com>, Matheus Alcantara <mths(dot)dev(at)pm(dot)me>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Using read stream in autoprewarm |
Date: | 2025-04-02 10:26:42 |
Message-ID: | CAN55FZ2ZSBLvfSF0wMAwNAkjZO6sMZrDhAKB2vZsBG0DnrVV=A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On Wed, 2 Apr 2025 at 01:36, Melanie Plageman <melanieplageman(at)gmail(dot)com> wrote:
>
> On Tue, Apr 1, 2025 at 8:50 AM Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> wrote:
> >
> > I am attaching v8, which is an updated version of the v7. I tried to
> > get rid of these local variables and refactored code to make logic
> > more straightforward instead of going back and forth.
> >
> > 0001 and 0002 are v8. 0003 is another refactoring attempt to make code
> > more straightforward. I did not squash 0003 to previous patches as you
> > might not like it.
>
> I looked at the code on your github branch that has all three of these
> squashed together.
Thank you!
> I think our approaches are converging. I like that you are
> fast-forwarding to the next filenumber or fork number explicitly when
> there is a bad relation or fork. I've changed my version (see newest
> one attached) to do the fast-forwarding inline instead of in a
> separate function like yours (the function didn't save many LOC and
> actually may have added to cognitive overhead).
>
> Compared to my version, I think you avoided one level of loop nesting with your
>
> if (!rel)
> else if (smgrexists(RelationGetSmgr(rel), blk->forknum))
> else
>
> but for starters, I don't think you can do this:
>
> else if (smgrexists(RelationGetSmgr(rel), blk->forknum))
>
> because you didn't check if you have a legal forknum first
You are right, I missed that. I think smgrexists() should return NULL
if the forknum is invalid but it is not a topic for this thread.
> And, I actually kind of prefer the explicitly nested structure
>
> loop through all relations
> loop through all forks
> loop through all buffers
I prefer this as well. We know when we opened the relation, so we do
not need to close it in two places like I did.
> While in the old structure, I liked your
> autoprewarm_prewarm_relation() function, but I think it is nicer
> inlined like in my version. It makes the loop through all buffers
> explicit too.
Yes, I liked your approach.
> I know you mentioned off-list that you don't like the handling of
> global objects in my version, but I prefer doing it this way (even
> though we have to check for in the loop condition) to having to set
> the current database once we reach non-shared objects. It feels too
> fiddly. This way seems less error prone. Looking at this version, what
> do you think? Could we do it better?
I think there might be a problem with that approach. Let's say that we
are able to open relation when database oid = 0 and filenumber = 18.
Then we are trying to find a valid fork now. We couldn't find a valid
fork immediately, so we continued looping. Then database oid is
changed from 0 to let's say 1 but filenumber remains the same. We are
still in the valid fork loop, so relation remains from the database
oid = 0. Isn't that wrong?
> Let me know what you think of this version. I think it is the best of
> both our approaches. I've separated it into two commits -- the first
> does all the refactoring without using the read stream API and the
> second one uses the read stream API.
Some comments,
0001:
ReadBufferExtended() can be called in its own minimal loop, otherwise
we end up doing unnecessary checks for each ReadBufferExtended() call.
This is not a problem when the 0002 is applied.
0002:
We don't skip blocks whose blocknum is more than nblocks_in_fork. We
can add that either to a stream callback like you did before or after
the read_stream_end. I prefer stream callback because of the reason
below [1].
> On another topic, what are the minimal places we need to call
> have_free_buffers() (in this version)? I haven't even started looking
> at the last patch you've been sending that is about checking the
> freelist. I'll have to do that next.
I think its current places are good enough. We may add one after the
read_stream_end if we want to handle blk->blocknum >= nblocks_in_fork
after the read stream finishes. If we handle that in the stream
callback then no need to add have_free_buffers() [1].
Other than these comments, I think the current structure looks good.
--
Regards,
Nazir Bilal Yavuz
Microsoft
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2025-04-02 10:39:34 | Re: Fix 035_standby_logical_decoding.pl race conditions |
Previous Message | Ashutosh Bapat | 2025-04-02 10:06:33 | Re: Test to dump and restore objects left behind by regression |