From: | Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> |
---|---|
To: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
Cc: | Kirill Reshke <reshkekirill(at)gmail(dot)com>, Matheus Alcantara <mths(dot)dev(at)pm(dot)me>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Using read stream in autoprewarm |
Date: | 2025-03-30 14:00:45 |
Message-ID: | CAN55FZ32QnQM41RMRU7a8T6A7anycWPr9PbK0dozOfgz9SxRPA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
Thank you for looking into this!
On Sat, 29 Mar 2025 at 23:10, Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> On Fri, Nov 29, 2024 at 6:20 AM Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> wrote:
> >
> > v4 is attached.
>
> I've started looking at this version. What I don't like about this
> structure is that we are forced to increment the position in the array
> of BlockInfoRecords in both the callback and the main loop in
> autoprewarm_database_main(). There isn't a way around it because we
> have to return control to the user when we encounter a new relation
> (we can't close the relation and destroy the read stream in the
> callback). And in the main loop in autoprewarm_database_main(), we may
> fail to open the next relation and then need to keep advancing the
> position in the array of BlockInfoRecords.
>
> It isn't just that we have to advance the position in both places --
> we also have to have a special case for the first block. All in all,
> given that in the current read stream API, a single stream must only
> concern itself with a single relation, fork combo, I think there is no
> elegant way to deal with this in autoprewarm.
>
> One alternative is to loop through the array of BlockInfoRecords and
> get the start and end positions of the blocks in the arary for a
> single relation/fork combo. Then we could make the read stream and
> pass those two positions and the array as callback_private_data. That
> would mean we loop through the whole array twice, but I wonder if the
> improvement in clarity is worth it?
I think this is a good alternative. I will work on this and try to
propose a patch.
> Some review feedback on your v4: I don't think we need the
> rs_have_free_buffer per_buffer_data. We can just check
> have_free_buffers() in both the callback and main loop in
> autoprewarm_database_main(). I also think you want a comment about why
> first_block is needed. And I think you need to guard the
> read_stream_end() after the loop -- what if we never made a read
> stream because we errored out for all the block's relations or
> something?
All of these are addressed. One extra thing I noticed is we were not
checking if blocknum < number_of_block_in_relation at the first_block
case in the stream callback, this is fixed now.
--
Regards,
Nazir Bilal Yavuz
Microsoft
Attachment | Content-Type | Size |
---|---|---|
v5-0001-Optimize-autoprewarm-with-read-streams.patch | application/octet-stream | 6.0 KB |
v5-0002-Count-free-buffers-at-the-start-of-the-autoprewar.patch | application/octet-stream | 2.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Magnus Hagander | 2025-03-30 15:03:06 | Re: pg_stat_database.checksum_failures vs shared relations |
Previous Message | jian he | 2025-03-30 13:53:22 | Re: speedup COPY TO for partitioned table. |