Re: Streaming I/O, vectored I/O (WIP)

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Streaming I/O, vectored I/O (WIP)
Date: 2024-03-19 15:04:53
Message-ID: 56bc4ac4-9a9e-4f28-af1e-b6d537bd9a5e@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Some quick comments:

On 12/03/2024 15:02, Thomas Munro wrote:
> src/backend/storage/aio/streaming_read.c
> src/include/storage/streaming_read.h

Standard file header comments missing.

It would be nice to have a comment at the top of streaming_read.c,
explaining at a high level how the circular buffer, lookahead and all
that works. Maybe even some diagrams.

For example, what is head and what is tail? Before reading the code, I
assumed that 'head' was the next block range to return in
pg_streaming_read_buffer_get_next(). But I think it's actually the other
way round?

> /*
> * Create a new streaming read object that can be used to perform the
> * equivalent of a series of ReadBuffer() calls for one fork of one relation.
> * Internally, it generates larger vectored reads where possible by looking
> * ahead.
> */
> PgStreamingRead *
> pg_streaming_read_buffer_alloc(int flags,
> void *pgsr_private,
> size_t per_buffer_data_size,
> BufferAccessStrategy strategy,
> BufferManagerRelation bmr,
> ForkNumber forknum,
> PgStreamingReadBufferCB next_block_cb)

I'm not a fan of the name, especially the 'alloc' part. Yeah, most of
the work it does is memory allocation. But I'd suggest something like
'pg_streaming_read_begin' instead.

Do we really need the pg_ prefix in these?

> Buffer
> pg_streaming_read_buffer_get_next(PgStreamingRead *pgsr, void **per_buffer_data)

Maybe 'pg_streaming_read_next_buffer' or just 'pg_streaming_read_next',
for a shorter name.

>
> /*
> * pgsr->ranges is a circular buffer. When it is empty, head == tail.
> * When it is full, there is an empty element between head and tail. Head
> * can also be empty (nblocks == 0), therefore we need two extra elements
> * for non-occupied ranges, on top of max_pinned_buffers to allow for the
> * maxmimum possible number of occupied ranges of the smallest possible
> * size of one.
> */
> size = max_pinned_buffers + 2;

I didn't understand this explanation for why it's + 2.

> /*
> * Skip the initial ramp-up phase if the caller says we're going to be
> * reading the whole relation. This way we start out doing full-sized
> * reads.
> */
> if (flags & PGSR_FLAG_FULL)
> pgsr->distance = Min(MAX_BUFFERS_PER_TRANSFER, pgsr->max_pinned_buffers);
> else
> pgsr->distance = 1;

Should this be "Max(MAX_BUFFERS_PER_TRANSFER,
pgsr->max_pinned_buffers)"? max_pinned_buffers cannot be smaller than
MAX_BUFFERS_PER_TRANSFER though, given how it's initialized earlier. So
perhaps just 'pgsr->distance = pgsr->max_pinned_buffers' ?

--
Heikki Linnakangas
Neon (https://neon.tech)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2024-03-19 15:15:23 Re: Read data from Postgres table pages
Previous Message Chapman Flack 2024-03-19 15:02:38 Re: Java : Postgres double precession issue with different data format text and binary