Quick Links

Re: Review: GiST support for UUIDs

From:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc:	Paul Jungwirth <pj(at)illuminatedcomputing(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Review: GiST support for UUIDs
Date:	2015-12-22 13:40:30
Message-ID:	CAB7nPqTD0kqF2i-+6d2i8Gru844vKUmLa47ioJnjzkvWAgS66A@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Sep 15, 2015 at 3:03 PM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
>
>
> Paul Jungwirth wrote:
>>>
>>> Or something like this in pseudocode:
>>>
>>> numeric = int8_numeric(*(uint64 *)(&i->data[0])) *
>>> int8_numeric(MAX_INT64) + int8_numeric(*(uint64 *)(&i->data[8]))
>>
>>
>> This is more like what I was hoping for, rather than converting to a
>> string and back. Would you mind confirming for me: int8_numeric turns an
>> int64 into an arbitrary-precision numeric Datum? So there is no overflow
>> risk here?
>
> Sure, no risk. Numeric precision is limited 1000 digits with magnitude 1000
>
>
>>
>> But it looks like int8_numeric takes a *signed* integer. Isn't that a
>> problem? I suppose I could get it working though by jumping through some
>> hoops.
>
> signed vs unsigned problem does not exist actually, because of precision of
> numeric is much better than we need and presence of numeric_abs.
>
>
>> Yes, but that seems like an unrealistic concern. Even "only" 2^64 is
>> 18446744073709551616 records. Or another way of thinking about it, if 64
>
> :) "only"
>
>> bits (or 32) is a good enough penalty input for all the other data
>> types, why not for UUIDs? Keep in mind type 3-5 UUIDs should be evenly
>> distributed. Perhaps we could use the bottom half (instead of the top)
>> to ensure even distribution for type 1 and 2 too.
>
> it must be. But UUID could be taken for unknown source and we can't predict
> distribution. I believe pg generates them correctly, but other generators
> could be not so good.
>
>> It seems to me that using only the top half should be okay, but if you
>> believe it's not I'll go with the int8_numeric approach (in three chunks
>> instead of two to work around signed-vs-unsigned).
>
> Yes, I believe. It is not good case when we can ruin index performance with
> special set of value.
>
> Some difficulty which I see is how to transform numeric penalty to double
> as it requires by GiST. May be, penalty/(INT_MAX64*INT_MAX64 + INT_MAX64)?
> Keep in mind, that penalty is how range will be enlarged after range union,
> so minimal penalty is 0 and maximum is 0xffffffffffffffffffffffffffffffff

This patch has been marked as "returned with feedback" for CF 2015-11
because of the presence of a review and a certain lack of activity..
Please free to move it to next CF if you think that's more adapted.
--
Michael

In response to

Re: Review: GiST support for UUIDs at 2015-09-15 06:03:00 from Teodor Sigaev

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Michael Paquier	2015-12-22 13:45:00	Re: Commit fest status for 2015-11
Previous Message	Michael Paquier	2015-12-22 13:36:25	Re: [Proposal] Table partition + join pushdown