Quick Links

Re: prefix btree implementation

From:	"Qingqing Zhou" <zhouqq(at)cs(dot)toronto(dot)edu>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: prefix btree implementation
Date:	2005-10-05 22:40:43
Message-ID:	di1a7o$2j6u$1@news.hub.org
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

"Alvaro Herrera" <alvherre(at)alvh(dot)no-ip(dot)org> wrote
> On Wed, Oct 05, 2005 at 12:50:41AM -0400, Tom Lane wrote:
>> Qingqing Zhou <zhouqq(at)cs(dot)toronto(dot)edu> writes:
>> > 1/ What types of prefix compression shall we support?
>>
>> Given the requirement of datatype independence, this idea seems a
>> complete nonstarter to me...
>
> How about having each type optionally provide the required routines?
> Thus we could provide them at least for the most common datatypes, and
> the system would continue working as currently for the rest (including
> user-defined types). Cross-column prefixes would be hard to handle I
> guess, as well as TOASTed data.
>

Yes, column-wise should not be difficult since it does require no knowledge
of the data types.

We can treat cross-column or incomplete-column share in two ways. One way
(binary-comparison) is that we just compare the items binaryly, i.e.,
without knowing what in fact is stored. The other way (datatype-comparison)
is we compare the items with some knowlege of the data types. For example,
suppose the index is defined on a varchar column and the examplar data look
like this:

{3|'aaa'}{4|'aaab'}{5|'aaabc'}

The binary-comparison way can't share prefix 'aaa' at all, because it will
see 3, 4 and 5 are totally different. The datatype-comparison way can share
the prefix 'aaa', but it has to know that each varchar column is associated
with a length header. When it compares, it ignores the header. The first way
is easy to implement, and works for some data types like integers, but no
acceptable I guess, since it even does not support varchar column prefix
sharing.

We can find a way to handle the above case, but it is better to find a
general way to handle any data types(include UDT). "Each type optionally
provide the required routines" could be a way, more details?

> One problem I do see is what happens if I need to insert a new tuple in
> the page that doesn't share the prefix. It obviously would have to be
> the leftmost or rightmost item on the page, but it's possible.
>

We do the prefix sharing when we build up index only, never on the fly.

Regards,
Qingqing

In response to

Re: prefix btree implementation at 2005-10-05 12:38:54 from Alvaro Herrera

Responses

Re: prefix btree implementation at 2005-10-06 02:42:32 from Junji TERAMOTO
Re: prefix btree implementation at 2005-10-06 21:31:27 from Jim C. Nasby

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Ron Peacetree	2005-10-05 23:54:15	Re: [HACKERS] A Better External Sort?
Previous Message	Richard Huxton	2005-10-05 20:58:09	Resultset duplicates (was Re: prefix btree implementation)