Re: Use streaming read API in ANALYZE

From: Mats Kindahl <mats(at)timescale(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, jakub(dot)wartak(at)enterprisedb(dot)com
Subject: Re: Use streaming read API in ANALYZE
Date: 2024-09-10 17:07:18
Message-ID: CA+14425G7wOHKO_gttGk1eiVeAKNtPoTBj9D0CJFD=GUzKhkVg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 5, 2024 at 11:12 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:

> On Thu, Sep 5, 2024 at 6:45 PM Mats Kindahl <mats(at)timescale(dot)com> wrote:
> > Forgive me for asking, but I am not entirely sure why the ReadStream
> struct is opaque. The usual reasons are:
> >
> > You want to provide an ABI to allow extensions to work with new major
> versions without re-compiling. Right now it is necessary to recompile
> extensions anyway, this does not seem to apply. (Because there are a lot of
> other changes that you need when switching versions because of the lack of
> a stable ABI for other parts of the code. However, it might be that the
> goal is to support it eventually, and then it would make sense to start
> making structs opaque.)
> > You want to ensure that you can make modifications inside a major
> version without breaking ABIs and requiring a re-compile. In this case, you
> could still follow safe practice of adding new fields last, not relying on
> the size of the struct for anything (e.g., no arrays of these structures,
> just pointers to them), etc. However, if you want to be very safe and
> support very drastic changes inside a major version, it needs to be opaque,
> so this could be the reason.
> >
> > Is it either of these reasons, or is there another reason?
> >
> > Making the ReadStream API non-opaque (that is, moving the definition to
> the header file) would at least solve our problem (unless I am mistaken).
> However, I am ignorant about long-term plans which might affect this, so
> there might be a good reason to revert it for reasons I am not aware of.
>
> The second thing. Also there are very active plans[1] to change the
> internal design of ReadStream in 18, since the goal is to drive true
> asynchronous I/O, and the idea of ReadStream was to create a simple
> API to let many consumers start using it, so that we can drive
> efficient modern system interfaces below that API, so having people
> depending on how it works would not be great.
>

That is understandable, since you usually do not want to have to re-compile
the extension for different minor versions. However, it would be a rare
case with extensions that are meddling with this, so might not turn out to
be a big problem in reality, as long as it is very clear to all involved
that this might change and that you make an effort to avoid binary
incompatibility by removing or changing types for fields.

> But let's talk about how that would actually look, for example if we
> exposed the struct or you took a photocopy of it... I think your idea
> must be something like: if you could access struct ReadStream's
> internals, you could replace stream->callback with an interceptor
> callback, and if the BlockSampler had been given the fake N + M
> relation size, the interceptor could overwrite
> stream->ios[next_io_index].op.smgr and return x - N if the intercepted
> callback returned x >= N. (Small detail: need to check
> stream->fast_path and use 0 instead or something like that, but maybe
> we could change that.)

Yes, this is what I had in mind, but I did not dig too deeply into the code.

> One minor problem that jumps out is that
> read_stream.c could inappropriately merge blocks from the two
> relations into one I/O. Hmm, I guess you'd have to teach the
> interceptor not to allow that: if switching between the two relation,
> and if the block number would coincide with
> stream->pending_read_blocknum + stream->pending_read_nblocks, it would
> need to pick a new block instead (interfering with the block sampling
> algorithm, but only very rarely). Is this what you had in mind, or
> something else?
>

Hmmm... I didn't look too closely at this. Since the block number comes
from the callback, I guess we could make sure to have a "padding" block
between the regions so that we "break" any suite of blocks, which I think
is what you mean with "teach the interceptor not to allow that", but I
would have to write a patch to make sure.

>
> (BTW I have a patch to teach read_stream.c about multi-smgr-relation
> streams, by adding a different constructor with a different callback
> that returns smgr, fork, block instead of just the block, but it
> didn't make it into 17.)
>

Without having looked at the patch, this sounds like the correct way to do
it.

>
> [1]
> https://www.postgresql.org/message-id/flat/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah(at)brqs62irg4dt
>

--
Best wishes,
Mats Kindahl, Timescale

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2024-09-10 17:24:20 Re: pg_verifybackup: TAR format backup verification
Previous Message Andres Freund 2024-09-10 16:59:03 Re: Refactoring postmaster's code to cleanup after child exit