Re: Streaming I/O, vectored I/O (WIP)

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Streaming I/O, vectored I/O (WIP)
Date: 2024-03-24 17:29:56
Message-ID: CAAKRu_bF=tT0R3TA9uZzAL=p_aG4mGvs1uBaUUG8kAaR0_qNVQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Mar 24, 2024 at 9:02 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> On Wed, Mar 20, 2024 at 4:04 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> > On 12/03/2024 15:02, Thomas Munro wrote:
> > > src/backend/storage/aio/streaming_read.c
> > > src/include/storage/streaming_read.h
> >
> > Standard file header comments missing.
>
> Fixed.
>
> > It would be nice to have a comment at the top of streaming_read.c,
> > explaining at a high level how the circular buffer, lookahead and all
> > that works. Maybe even some diagrams.
>
> Done.
>
> > For example, what is head and what is tail? Before reading the code, I
> > assumed that 'head' was the next block range to return in
> > pg_streaming_read_buffer_get_next(). But I think it's actually the other
> > way round?
>
> Yeah. People seem to have different natural intuitions about head vs
> tail in this sort of thing, so I've switched to descriptive names:
> stream->{oldest,next}_buffer_index (see also below).
>
> > > /*
> > > * Create a new streaming read object that can be used to perform the
> > > * equivalent of a series of ReadBuffer() calls for one fork of one relation.
> > > * Internally, it generates larger vectored reads where possible by looking
> > > * ahead.
> > > */
> > > PgStreamingRead *
> > > pg_streaming_read_buffer_alloc(int flags,
> > > void *pgsr_private,
> > > size_t per_buffer_data_size,
> > > BufferAccessStrategy strategy,
> > > BufferManagerRelation bmr,
> > > ForkNumber forknum,
> > > PgStreamingReadBufferCB next_block_cb)
> >
> > I'm not a fan of the name, especially the 'alloc' part. Yeah, most of
> > the work it does is memory allocation. But I'd suggest something like
> > 'pg_streaming_read_begin' instead.
>
> I like it. Done.
>
> > Do we really need the pg_ prefix in these?
>
> Good question. My understanding of our convention is that pg_ is
> needed for local replacements/variants/extensions of things with well
> known names (pg_locale_t, pg_strdup(), yada yada), and perhaps also in
> a few places where the word is very common/short and we want to avoid
> collisions and make sure it's obviously ours (pg_popcount?), and I
> guess places that reflect the name of a SQL identifier with a prefix,
> but this doesn't seem to qualify for any of those things. It's a new
> thing, our own thing entirely, and sufficiently distinctive and
> unconfusable with standard stuff. So, prefix removed.
>
> Lots of other patches on top of this one are using "pgsr" as a
> variable name, ie containing that prefix; perhaps they would use "sr"
> or "streaming_read" or "stream". I used "stream" in a few places in
> this version.
>
> Other names improved in this version IMHO: pgsr_private ->
> callback_private. I find it clearer, as a way to indicate that the
> provider of the callback "owns" it. I also reordered the arguments:
> now it's streaming_read_buffer_begin(..., callback, callback_private,
> per_buffer_data_size), to keep those three things together.

I haven't reviewed the whole patch, but as I was rebasing
bitmapheapscan streaming read user, I found callback_private confusing
because it seems like it is a private callback, not private data
belonging to the callback. Perhaps call it callback_private_data? Also
maybe mention what it is for in the comment above
streaming_read_buffer_begin() and in the StreamingRead structure
itself.

- Melanie

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Melanie Plageman 2024-03-24 17:38:33 Re: BitmapHeapScan streaming read user and prelim refactoring
Previous Message Tom Lane 2024-03-24 16:53:36 Re: [PoC] Improve dead tuple storage for lazy vacuum