From: | "Qingqing Zhou" <zhouqq(at)cs(dot)toronto(dot)edu> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: prefix btree implementation |
Date: | 2005-10-05 22:40:43 |
Message-ID: | di1a7o$2j6u$1@news.hub.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
"Alvaro Herrera" <alvherre(at)alvh(dot)no-ip(dot)org> wrote
> On Wed, Oct 05, 2005 at 12:50:41AM -0400, Tom Lane wrote:
>> Qingqing Zhou <zhouqq(at)cs(dot)toronto(dot)edu> writes:
>> > 1/ What types of prefix compression shall we support?
>>
>> Given the requirement of datatype independence, this idea seems a
>> complete nonstarter to me...
>
> How about having each type optionally provide the required routines?
> Thus we could provide them at least for the most common datatypes, and
> the system would continue working as currently for the rest (including
> user-defined types). Cross-column prefixes would be hard to handle I
> guess, as well as TOASTed data.
>
Yes, column-wise should not be difficult since it does require no knowledge
of the data types.
We can treat cross-column or incomplete-column share in two ways. One way
(binary-comparison) is that we just compare the items binaryly, i.e.,
without knowing what in fact is stored. The other way (datatype-comparison)
is we compare the items with some knowlege of the data types. For example,
suppose the index is defined on a varchar column and the examplar data look
like this:
{3|'aaa'}{4|'aaab'}{5|'aaabc'}
The binary-comparison way can't share prefix 'aaa' at all, because it will
see 3, 4 and 5 are totally different. The datatype-comparison way can share
the prefix 'aaa', but it has to know that each varchar column is associated
with a length header. When it compares, it ignores the header. The first way
is easy to implement, and works for some data types like integers, but no
acceptable I guess, since it even does not support varchar column prefix
sharing.
We can find a way to handle the above case, but it is better to find a
general way to handle any data types(include UDT). "Each type optionally
provide the required routines" could be a way, more details?
> One problem I do see is what happens if I need to insert a new tuple in
> the page that doesn't share the prefix. It obviously would have to be
> the leftmost or rightmost item on the page, but it's possible.
>
We do the prefix sharing when we build up index only, never on the fly.
Regards,
Qingqing
From | Date | Subject | |
---|---|---|---|
Next Message | Ron Peacetree | 2005-10-05 23:54:15 | Re: [HACKERS] A Better External Sort? |
Previous Message | Richard Huxton | 2005-10-05 20:58:09 | Resultset duplicates (was Re: prefix btree implementation) |