Re: Table AM Interface Enhancements

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Table AM Interface Enhancements
Date: 2024-04-15 19:09:47
Message-ID: CA+TgmoZNekJgFtAfAvqDNdwutoSureckmnG5fEbL-6a1+JwY0A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 15, 2024 at 12:41 PM Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> wrote:
> I agree with you but I did not understand one thing. If out-of-core
> AMs are used, does not all block sampling logic (BlockSampler_Init(),
> BlockSampler_Next() etc.) need to be edited as well since these
> functions assume block numbers are actual physical on-disk location,
> right? I mean if the block number is something different than the
> actual physical on-disk location, the acquire_sample_rows() function
> looks wrong to me before c6fc50cb4028 as well.

Yes, this is also a problem with trying to use non-physical block
numbers. We can hypothesize an AM where it works out OK in practice,
say because there are always exactly the same number of logical block
numbers as there are physical block numbers. Or, because there are
always more logical block numbers than physical block numbers, but for
some reason the table AM author doesn't care because they believe that
in the target use case for their AM the data distribution will be
sufficiently uniform that sampling only low-numbered blocks won't
really hurt anything.

But that does seem a bit strained. In practice, I suspect that table
AMs that use logical block numbers might want to replace this line
from acquire_sample_rows() with a call to a tableam method that
returns the number of logical blocks:

totalblocks = RelationGetNumberOfBlocks(onerel);

But even that does not seem like enough, because my guess would be
that a lot of table AMs would end up with a sparse logical block
space. For instance, you might create a logical block number sequence
that starts at 0 and just counts up towards 2^32 and eventually either
wraps around or errors out. Each new tuple gets the next TID that
isn't yet used. Well, what's going to happen eventually in a lot of
workloads is that the low-numbered logical blocks are going to be
mostly or entirely empty, and the data is going to be clustered in the
ones that are nearer to the highest logical block number that's so far
been assigned. So, then, as you say, you'd want to replace the whole
BlockSampler thing entirely.

That said, I find it a little bit hard to know what people are already
doing or realistically might try to do with table AMs. If somebody
says they have a table AM where the number of logical block numbers
equals the number of physical block numbers (or is somewhat larger but
in a way that doesn't really matter) and the existing block sampling
logic works well enough, I can't really disprove that. It puts awfully
tight limits on what the AM can be doing, but, OK, sometimes people
want to develop AMs for very specific purposes. However, because of
the prefetching thing, I think even that fairly narrow use case was
already broken before 041b96802efa33d2bc9456f2ad946976b92b5ae1. So I
just don't really see how that commit made anything worse in any way
that really matters.

But maybe it did. People often find extremely creative ways of working
around the limitations of the core interfaces. I think it could be the
case that someone found a clever way of dodging all of these problems
and had something that was working well enough that they were happy
with it, and now they can't make it work after the changes for some
reason. If that someone is reading this thread and wants to spell that
out, we can consider whether there's some relief that we could give to
that person, *especially* if they can demonstrate that they raised the
alarm before the commit went in. But in the absence of that, my
current belief is that nonphysical block numbers were never a
supported scenario; hence, the idea that
041b96802efa33d2bc9456f2ad946976b92b5ae1 should be reverted for
de-supporting them ought to be rejected.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2024-04-15 19:11:44 Re: SET ROLE documentation improvement
Previous Message Andres Freund 2024-04-15 18:57:20 Removing GlobalVisTestNonRemovableHorizon