Re: Confine vacuum skip logic to lazy_scan_skip

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, Noah Misch <noah(at)leadboat(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
Subject: Re: Confine vacuum skip logic to lazy_scan_skip
Date: 2025-02-28 02:13:46
Message-ID: CA+hUKGLa7ba7USyT+JR7uRiawWeCVJ96wyRsoEXk7r2gngPv=A@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 28, 2025 at 2:29 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> On Fri, Feb 28, 2025 at 11:58 AM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
> > On Thu, Feb 27, 2025 at 1:08 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > > I wonder if it'd be a good idea to add something like
> > >
> > > Assert(stream->distance == 1);
> > > Assert(stream->pending_read_nblocks == 0);
> > > Assert(stream->per_buffer_data_size == 0);
> > > + Assert(per_buffer_data == NULL);
> > >
> > > in read_stream_next_buffer. I doubt that this will shut Coverity
> > > up, but it would help to catch caller coding errors, i.e. passing
> > > a per_buffer_data pointer when there's no per-buffer data.
> >
> > I think this is a good stopgap. I was discussing adding this assert
> > off-list with Thomas and he wanted to detail his more ambitious plans
> > for type safety improvements in the read stream API. Less on the order
> > of a redesign and more like a separate read_stream_next_buffer()s for
> > when there is per buffer data and when there isn't. And a by-value and
> > by-reference version for the one where there is data.
>
> Here's what I had in mind. Is it better?

Here's a slightly better one. I think that when you use
read_stream_get_buffer_and_value(stream, &value) or
read_stream_put_value(stream, space, value), we should assert that
sizeof(value) exactly matches the available space, as shown. But, new
in v2, if you use read_stream_get_buffer_and_pointer(stream,
&pointer), then sizeof(*pointer) only has to be <= the storage space,
not ==, because someone might plausibly want to choose
per_buffer_data_size at run time (i.e. decide when they construct the
stream) and then retrieve a pointer to the start of a struct with a
flexible array member or something like that. In v1 I was only trying
to assert that the argument was a pointer-to-a-pointer-to-something
and no more (with a confusing compile-time assertion); v2 is simpler,
and accepts a pointer to a pointer to anything that doesn't exceed the
space (checked with a run-time assertion).
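
For illustration only, the two size rules can be sketched in standalone
C. This is not the code from the attached patches: ToyStream,
ToyHeader, toy_value_size_ok, toy_pointer_size_ok, toy_get_value and
toy_get_pointer are hypothetical names invented for this sketch, and
the real read stream functions also hand back a buffer, which is left
out here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one per-buffer data slot; NOT PostgreSQL's ReadStream. */
typedef struct ToyStream
{
	size_t		per_buffer_data_size;	/* chosen when the stream is built */
	union
	{
		max_align_t align;		/* force suitable alignment for any type */
		char		bytes[64];
	}			slot;
} ToyStream;

/* Example payload whose real size is only known at run time. */
typedef struct ToyHeader
{
	int			nitems;
	int			items[];		/* flexible array member */
} ToyHeader;

/* By-value rule: the caller's object must exactly fill the slot. */
static inline int
toy_value_size_ok(const ToyStream *stream, size_t value_size)
{
	return value_size == stream->per_buffer_data_size;
}

/* By-pointer rule: the pointee only has to fit inside the slot. */
static inline int
toy_pointer_size_ok(const ToyStream *stream, size_t pointee_size)
{
	return pointee_size <= stream->per_buffer_data_size;
}

/* Copy the slot into *vp, asserting sizeof(*vp) == slot size. */
#define toy_get_value(stream, vp) \
	(assert(toy_value_size_ok((stream), sizeof(*(vp)))), \
	 (void) memcpy((vp), (stream)->slot.bytes, sizeof(*(vp))))

/* Point *pp at the slot, asserting sizeof(**pp) <= slot size. */
#define toy_get_pointer(stream, pp) \
	(assert(toy_pointer_size_ok((stream), sizeof(**(pp)))), \
	 (void) (*(pp) = (void *) (stream)->slot.bytes))
```

With per_buffer_data_size set to sizeof(ToyHeader) + 3 * sizeof(int),
toy_get_pointer(&stream, &header) passes its assertion, while a
by-value fetch of a ToyHeader would fail the == check. That is the
case the relaxed <= rule is meant to allow.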

Attachments:
  v2-0001-Improve-API-for-retrieving-data-from-read-streams.patch (application/x-patch, 9.7 KB)
  v2-0002-Improve-API-for-storing-data-in-read-streams.patch (application/x-patch, 2.9 KB)
