Prefix compression of B-Tree keys

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Prefix compression of B-Tree keys
Date: 2014-01-31 05:26:18
Message-ID: CAGTBQpYbhy-ahV6qzq3Yqp43uS2UrukNmf-GZxHckJVrSPj8VA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 30, 2014 at 9:34 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Peter Geoghegan <pg(at)heroku(dot)com> writes:
>> On more occasions than I care to recall, someone has suggested that it
>> would be valuable to do something with strxfrm() blobs in order to
>> have cheaper locale-aware text comparisons. One obvious place to do so
>> would be in indexes, but in the past that has been dismissed on the
>> following grounds:
>
>> * Index-only scans need fully formed datums to work, and strxfrm() is
>> a one-way function (or so I believe). There is no reason to think that
>> the original string can be reassembled from the blob, so clearly that
>> won't fly.
>
>> * There is a high cost to be paid in storage overhead. Even for
>> collations like "en_US.UTF-8", that can mean that the blob is as much
>> as 3-4 times larger than the original text string. Who is to say that
>> we'll come out ahead even with the savings of just doing a strcmp()
>> rather than a strcoll()?
>
> Quite aside from the index bloat risk, this effect means a 3-4x reduction
> in the maximum string length that can be indexed before getting the
> dreaded "Values larger than 1/3 of a buffer page cannot be indexed" error.
> Worse, a value insertion might well succeed, with the failure happening
> only (much?) later when that entry is chosen as a page split boundary.
>
> It's possible that TOAST compression of the strings would save you, but
> I'm not convinced of that; it certainly doesn't seem like we could
> guarantee no post-insertion failures that way.

Maybe not TOAST compression, but prefix compression.

I've been wondering lately, whether a format change in the B-Tree
could be worth the effort: store values with prefix compression. It
would certainly help indexes of many kinds (text-valued, multi-column,
etc...).

Now, I haven't checked if it's already done. Sorry if it is. I did
mock around btree code a lot and don't remember any of this, but I do
remember stuff that could be used to achieve it (specifically, all the
index-only-scan machinery, which allows for implicit info).

Granted, this isn't strxfrm, but it may provide some of the benefits
(faster comparisons because less bytes are compared?).

On Fri, Jan 31, 2014 at 1:51 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> We're already suffering from the fact
> that strcoll() mandates a NULL-terminated string, since we have to
> copy text datums to a buffer to deal with the impedance mismatch.
> Furthermore, multi-column comparisons probably have a lot of overhead
> at present from all the indirection alone.

Which makes the above even more likely to help.

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2014-01-31 05:26:42 Re: Backup throttling
Previous Message Peter Geoghegan 2014-01-31 04:51:18 Re: Making strxfrm() blobs in indexes work