Re: Using read stream in autoprewarm

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Cc: Kirill Reshke <reshkekirill(at)gmail(dot)com>, Matheus Alcantara <mths(dot)dev(at)pm(dot)me>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Using read stream in autoprewarm
Date: 2025-04-01 02:14:07
Message-ID: CAAKRu_Y5GOcwJ+Nzv553oEJ3nA92ceWNi0N=HY5zme1wCiXDxw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 31, 2025 at 3:45 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> Whoops, this isn't right. It does work. I'm going to draft a version
> suggesting slightly different variable naming and a couple comments to
> make this more clear.

Okay, after further study, I think the code still has multiple issues.
We could end up comparing a block number to an nblocks value that was
calculated from a different fork. To address this, you'll need to keep
track of the last fork_number. At that point, you more or less have to
bring back old_blk, because that is what we are recreating with
multiple local variables.

But I think what we actually want, overall, is to be really explicit
about fast-forwarding in the failure cases (when we want to skip ahead
because a relation or a fork is invalid). We were trying to use the
main loop control and add special cases to allow this fast-forwarding.
Instead, I think we want to go to a function or loop somewhere and
fast-forward through those bad blocks until we reach the next run of
blocks from a different relation or fork.
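
As a rough illustration of the nested-loop shape described above, here
is a minimal standalone C sketch. It is not taken from the attached
patch. DemoRec, demo_nblocks, demo_same_fork and demo_count_prewarmable
are hypothetical names, the size lookup is a toy, and an invalid
relation or fork is modeled simply as "0 blocks". The point it tries to
show is that nblocks is looked up once per fork, so a block number is
only ever compared against its own fork's size, and that bad runs are
skipped by an explicit inner loop.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a dump record (database and tablespace omitted). */
typedef struct DemoRec
{
    unsigned    filenumber;
    int         forknum;
    unsigned    blocknum;
} DemoRec;

/*
 * Toy size lookup (an assumption for this sketch): relation 11 is treated
 * as invalid (0 blocks), fork 1 has 1 block, everything else has 4.
 */
static unsigned
demo_nblocks(unsigned filenumber, int forknum)
{
    if (filenumber == 11)
        return 0;
    return forknum == 1 ? 1 : 4;
}

/* Does this record belong to the given relation and fork? */
static int
demo_same_fork(const DemoRec *rec, unsigned filenumber, int forknum)
{
    return rec->filenumber == filenumber && rec->forknum == forknum;
}

/* Count the records that would be prewarmed; records are assumed sorted. */
static size_t
demo_count_prewarmable(const DemoRec *recs, size_t n)
{
    size_t      pos = 0;
    size_t      loaded = 0;

    assert(recs != NULL || n == 0);

    while (pos < n)             /* one iteration per relation */
    {
        unsigned    filenumber = recs[pos].filenumber;

        while (pos < n && recs[pos].filenumber == filenumber)
        {                       /* one iteration per fork */
            int         forknum = recs[pos].forknum;
            unsigned    nblocks = demo_nblocks(filenumber, forknum);

            if (nblocks == 0)
            {
                /* Explicit fast-forward past the whole bad run. */
                while (pos < n &&
                       demo_same_fork(&recs[pos], filenumber, forknum))
                    pos++;
                continue;
            }

            /* Good fork: compare each block only to this fork's nblocks. */
            while (pos < n &&
                   demo_same_fork(&recs[pos], filenumber, forknum))
            {
                if (recs[pos].blocknum < nblocks)
                    loaded++;
                pos++;
            }
        }
    }
    return loaded;
}
```

With the toy lookup, block 2 of fork 1 is rejected because fork 1 has
only 1 block, even though the previous fork of the same relation had 4.
A version that carried nblocks over from the previous fork would have
accepted it.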

I've sketched out an idea like this in the attached patch. I don't
think it is 100% correct. It does pass tests, but I think we might
incorrectly advance pos twice after skipping a run of blocks belonging
to a bad fork or relation, and thus skip the first good block after
some bad blocks.
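
To make the general hazard concrete (this is a toy illustration, not a
diagnosis of the attached patch), the sketch below uses a hypothetical
cursor whose fetch call consumes the item it returns. DemoCursor,
demo_next and demo_skip_consuming are invented names, and the
assumption that the fetch call advances the position is mine. A skip
loop built on such a call has already consumed the first good item by
the time it stops, so a caller that fetches again instead of using the
returned item loses that item.

```c
#include <assert.h>
#include <stddef.h>

/* Toy cursor over an array of fork ids (hypothetical, for illustration). */
typedef struct DemoCursor
{
    const int  *forks;
    size_t      n;
    size_t      pos;
} DemoCursor;

/* Return the next item and advance past it, or NULL at the end. */
static const int *
demo_next(DemoCursor *c)
{
    assert(c != NULL);
    return c->pos < c->n ? &c->forks[c->pos++] : NULL;
}

/*
 * Skip every item equal to "bad". The returned pointer is the first item
 * that did not match, and the cursor is already positioned AFTER it.
 */
static const int *
demo_skip_consuming(DemoCursor *c, int bad)
{
    const int  *item;

    while ((item = demo_next(c)) != NULL && *item == bad)
        ;                       /* keep consuming bad items */
    return item;
}
```

If the caller processes the pointer returned by demo_skip_consuming,
nothing is lost. If the enclosing loop instead calls demo_next again at
the top of its next iteration, the first good item is silently skipped.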

It also needs some more refactoring.

Maybe instead of having the skip code like this:

    skip_forknumber:;
        while ((blk = next_record(block_info, &i)) != NULL &&
               blk->database == database && blk->filenumber == filenumber &&
               blk->forknum == forknum);

we could make it a function, to avoid the back-to-back while loop
conditions (because of the outer do while loop)?
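
One possible shape for such a function, as a minimal sketch under
stated assumptions: DemoBlockRecord is a simplified stand-in for the
real record type, and skip_fork_run is a hypothetical name that does
not appear in the patch. It takes a position and returns the index of
the first record outside the current run, rather than advancing a
shared cursor itself.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a dump-file record; not the real struct. */
typedef struct DemoBlockRecord
{
    unsigned    database;
    unsigned    filenumber;
    int         forknum;
    unsigned    blocknum;
} DemoBlockRecord;

/*
 * Hypothetical helper: starting at records[pos], return the index of the
 * first record that does not share its (database, filenumber, forknum),
 * or nrecords if the run extends to the end of the array.
 */
static size_t
skip_fork_run(const DemoBlockRecord *records, size_t nrecords, size_t pos)
{
    const DemoBlockRecord *bad;

    assert(pos < nrecords);
    bad = &records[pos];

    while (pos < nrecords &&
           records[pos].database == bad->database &&
           records[pos].filenumber == bad->filenumber &&
           records[pos].forknum == bad->forknum)
        pos++;

    return pos;
}
```

Because the helper only computes an index, the caller assigns the
position exactly once (for example, pos = skip_fork_run(records,
nrecords, pos)), and the run condition is written in a single place.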

Either way, I think the explicit loop for skipping the bad blocks and
the nested loops for rel and fork are less error-prone.

What do you think?

- Melanie

Attachment: pgsr-autoprewarm-nested-loops.patch (text/x-patch, 7.4 KB)
