Re: pgsql: Add deduplication to nbtree.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgsql: Add deduplication to nbtree.
Date: 2020-03-01 23:01:55
Message-ID: CAH2-WzkPR8KMwkL0ap976kmXwBCeukTeHz6fB-U__wvuP1S9Zg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers

On Sun, Mar 1, 2020 at 2:14 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Attached patch shows how this could work. I prefer my original
> > approach, but I can see the argument for doing it this way.
>
> This does seem a bit duplicative ... and shouldn't both code paths
> include a final "Assert(d == vacposting->ndeletedtids)"?

No, because the "nhtids == 1" loop has a "break" when the first and
only TID that we need to keep around is hit.

> Another thing that maybe bears closer scrutiny is the size calculation.
> It seems a bit confused as to whether the offset of the posting list
> within the tuple, or the total tuple size, or both, needs to be
> MAXALIGN'd.

That's not quite true in my opinion. BTreeTupleSetPosting() has an
Assert() of its own about alignment. ISTM that it's reasonable for
_bt_update_posting() to assume that BTreeTupleGetPostingOffset() will
return a MAXALIGN()'d offset. Note also that _bt_form_posting() is
explicit about what it expects around alignment. This is
_bt_update_posting()'s sibling function. The _bt_update_posting()
calculation is explicitly documented as being derived from the one in
_bt_form_posting().

I am happy to add parallel-to-_bt_form_posting() assertions about
alignment to _bt_form_posting(), to nail it down completely. Plus I'll
add the assertion I suggested already. Once I commit a patch with
these two new assertions, I think that we can consider the matter
resolved. Does that seem reasonable to you?

--
Peter Geoghegan

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Peter Geoghegan 2020-03-02 00:09:37 Re: pgsql: Add deduplication to nbtree.
Previous Message Tom Lane 2020-03-01 22:14:03 Re: pgsql: Add deduplication to nbtree.