Re: RFC: Key normalization for nbtree

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Claudio Freire <klaussfreire(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: RFC: Key normalization for nbtree
Date: 2017-07-10 20:36:29
Message-ID: CAH2-Wzkk08LOv8iniEo3sRpmEz8Pb9k3NDb3raioNszu7xTWow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 10, 2017 at 1:23 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> Claudio Freire wrote:
>
>> A missing optimization is that having tid unification allows VACUUM to
>> implement a different strategy when it needs to clean up only a tiny
>> fraction of the index. It can do the lookup by key-tid instead of
>> scanning the whole index, which can be a win if the index is large and
>> the number of index pointers to kill is small.
>
> Doing index cleanup by using keys instead of scanning the whole index
> would be a *huge* win for many use cases. I don't think that work needs
> to be related to either of these patches.

The problem is that it will perform horribly when there are many
duplicates, unless you really can treat heap TID as a part of the key.
Once you do that, you can be almost certain that you won't have to
visit more than one leaf page per index tuple to kill. And, you can
amortize descending the index among a set of duplicate keyed tuples.
You arrive at the leaf tuple with the lowest sort order, based on
(key, TID), kill that, and then continue an index scan along the leaf
level. VACUUM ticks them off a private "kill list" which is also
ordered by (key, TID). Much less random I/O, and pretty good
guarantees about the worst case.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2017-07-10 20:43:17 Re: RFC: Key normalization for nbtree
Previous Message Alvaro Herrera 2017-07-10 20:23:46 Re: RFC: Key normalization for nbtree