Re: MaxOffsetNumber for Table AMs

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MaxOffsetNumber for Table AMs
Date: 2021-04-30 18:23:02
Message-ID: CA+TgmoaeqdneNd+7Ym_5_xkx0OUn_L6=3beHw=SOGh_nEtyK2g@mail.gmail.com
Lists: pgsql-hackers

On Fri, Apr 30, 2021 at 2:05 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> I agree in principle, but making that work well is very hard in
> practice because of the format of IndexTuple -- which bleeds into
> everything. That TID is special is probably a natural consequence of
> the fact that we don't have an offset-based format of the kind you see
> in other DB systems -- systems that don't emphasize extensibility. We
> cannot jump to a hypothetical TID attribute inexpensively inside code
> like _bt_compare() because we don't have a cheap way to jump straight
> to the datum for any attribute. So we just store TID in IndexTuple
> directly instead. Imagine how much more expensive VACUUM would be if
> it had to grovel through the IndexTuple format.

I can't imagine that, so maybe you want to enlighten me? I see that
there's a potential problem there, and I'm glad you pointed it out
because I hadn't thought about it previously ... but if you always put
the column or columns that VACUUM would need first, it's not obvious
to me that it would be all that expensive. Deforming the tuple to a
sufficient degree to extract the first column, which would even be
fixed-width, shouldn't take much work.
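
To illustrate the point about a leading fixed-width column, here is a
minimal sketch. The layout is hypothetical and is not PostgreSQL's
IndexTuple format: it assumes a 2-byte length header followed by a
6-byte TID (4-byte block number, 2-byte offset number), with the
variable-width key data after that. The names pack_tuple and
first_column_tid are made up for the example.

```python
import struct

# Hypothetical layout (NOT PostgreSQL's IndexTuple): a 2-byte total-length
# header, then a fixed-width 6-byte TID, then variable-width key bytes.
HEADER = struct.Struct("<H")   # total tuple length in bytes
TID = struct.Struct("<IH")     # 4-byte block number, 2-byte offset number


def pack_tuple(block, offset, key):
    """Build a tuple with the TID stored as the first, fixed-width column."""
    total = HEADER.size + TID.size + len(key)
    return HEADER.pack(total) + TID.pack(block, offset) + key


def first_column_tid(buf):
    """Read the TID at a constant offset, without looking at the key data."""
    return TID.unpack_from(buf, HEADER.size)


tup = pack_tuple(42, 7, b"some variable-width key")
print(first_column_tid(tup))  # (42, 7)
```

Because the column sits at a fixed position, reading it does not depend
on how many attributes follow or how wide they are.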

> I wonder how the same useful performance characteristics can be
> maintained with a variable-width TID design. If you solve the problem
> by changing IndexTuple, then you are kind of obligated to not use
> varlena headers to keep the on-disk size manageable. Who knows where
> it all ends?

What's wrong with varlena headers? It would end up being a 1-byte
header in practically every case, and no variable-width representation
can do without a length word of some sort. I'm not saying varlena is
as efficient as some new design could hypothetically be, but it
doesn't seem like it'd be a big enough problem to stress about. If you
used a variable-width representation for integers, you might actually
save bytes in a lot of cases. An awful lot of the TIDs people store in
practice probably contain several zero bytes, and if we make them
wider, that's going to be even more true.
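
As a rough illustration of the byte savings, here is a sketch that uses a
LEB128-style variable-width integer encoding (7 payload bits per byte,
high bit set when more bytes follow). This is an assumption chosen for
the example, not an existing PostgreSQL format, and encode_tid and
decode_tid are hypothetical names. The baseline for comparison is the
current fixed 6-byte TID.

```python
def encode_varint(n):
    """Encode a non-negative integer, 7 bits per byte, low bits first."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf, pos=0):
    """Decode one integer starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_tid(block, offset):
    """Hypothetical variable-width TID: block number, then offset number."""
    return encode_varint(block) + encode_varint(offset)


def decode_tid(buf):
    block, pos = decode_varint(buf)
    offset, _ = decode_varint(buf, pos)
    return block, offset


for tid in [(0, 1), (1000, 50), (2**32 - 1, 65535)]:
    print(tid, len(encode_tid(*tid)), "bytes")
# (0, 1) 2 bytes
# (1000, 50) 3 bytes
# (4294967295, 65535) 8 bytes
```

Small block and offset numbers fit in 2 or 3 bytes instead of 6, while
the largest values cost 8, so whether this wins overall depends on the
distribution of TIDs actually stored.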

--
Robert Haas
EDB: http://www.enterprisedb.com
