Re: Using read stream in autoprewarm

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Cc: Kirill Reshke <reshkekirill(at)gmail(dot)com>, Matheus Alcantara <mths(dot)dev(at)pm(dot)me>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Using read stream in autoprewarm
Date: 2025-03-31 19:27:54
Message-ID: CAAKRu_Y2Ai-uQtSEfcP2MKq12gtEU-oh-w1JkxxVofd3ijLaSw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 31, 2025 at 2:58 PM Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> wrote:
>
> Do you think that I should continue to
> attach both approaches?

No, for now let's try and get this approach to a good place and then
see which one we like.

I think there might be another problem with the code. We only set
cur_database in the loop in autoprewarm_databas_main() when it is 0

if (cur_database != blk->database)
{
if (cur_database == 0)
cur_database = blk->database;

I know that the read stream will return InvalidBlockNumber when we
move onto the next database, but I don't see how we will end up
actually stopping prewarming in autoprewarm_database_main() when we
move on to the next database.

Another thing:
I don't know if it is a correctness issue but in
autoprewarm_database_main(), in this case
if (!rel && cur_filenumber != blk->filenumber)
{
you have removed the Assert(rel == NULL) -- I worry that we will end
up with a rel from a previous iteration after failing to open th enext
rel. I think we want this assert.

And a last thing
I noticed is that the initial values for cur_database, cur_filenumber,
and nblocks_in_fork in autoprewarm_database_main() are all initialized
to different kinds of initial values for different reasons. I'm
thinking if there is a way to make it consistent.

cur_database = block_info[pos].database;
cur_filenumber = InvalidOid;
nblocks_in_fork = InvalidBlockNumber;

cur_database is set to be the same as the first database in the array
so that we won't hit
if (cur_database != blk->database)
on the first block.

However, we make cur_filenumber InvalidOid because for the first block
we want to hit code that forces us to open a new relation
if (!rel && cur_filenumber != blk->filenumber)

And nblocks_in_fork to InvalidBlockNumber so that 1) we don't have to
get the number before starting the loop and 2) so we would move past
BlockInfoRecords with invalid filenumber and invalid blocknumber

if (cur_filenumber == blk->filenumber &&
blk->blocknum >= nblocks_in_fork)

So, I'm just thinking if it would be correct to initialize
cur_database to InvalidOid and to check for that before skipping a
block, or if that doesn't work when the first blocks' database is
InvalidOid.

- Melanie

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2025-03-31 19:35:23 Re: Make COPY format extendable: Extract COPY TO format implementations
Previous Message Robert Haas 2025-03-31 19:24:51 Re: RFC: Logging plan of the running query