From: | Andreas Karlsson <andreas(at)proxel(dot)se> |
---|---|
To: | emre(at)hasegeli(dot)com |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: GiST support for inet datatypes |
Date: | 2014-01-31 01:58:09 |
Message-ID: | 52EB0331.5090500@proxel.se |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 01/23/2014 11:22 PM, Emre Hasegeli wrote:
> inet-gist
> ---------
> I realized that create extension btree_gist fails with the patch.
>
> ERROR: could not make operator class "gist_inet_ops" be default for type inet
> DETAIL: Operator class "inet_ops" already is the default.
>
> I think making the new operator class default makes more sense. Name conflict
> can be a bigger problem. Should I rename the new operator class?
I agree that the new operator class should be the default one, it is
more useful and also not buggy. No idea about the naming.
>> The only thing is that I think your index would do poorly on the case where
>> all values share a prefix before the netmask but have different values after
>> the netmask, since gist_union and gist_penalty does not care about the bits
>> after the netmask. Am I correct? Have you thought anything about this?
>
> Yes, I have tried to construct the index with ip_maxbits() instead of ip_bits()
> but I could not make it work with the cidr type. It can be done by adding
> operator as a parameter for union, penalty and picksplit functions on the
> GiST framework. I am not sure it worths the trouble. It would only help
> the basic comparison operators and it would slow down other operators because
> network length comparison before network address would not be possible for
> the intermediate nodes. I mentioned this behavior of the operator class on
> the paragraph added to the documentation.
I think this is fine since it does not seem like a very useful case to
support to me. Would be worth doing only if it is easy to do.
A separate concern: we still have not come to a decision about operators
for inet here. I do not like the fact that the operators differ between
ranges and inet. But I feel this might be out of scope for this patch.
Does any third party want to chime in with an opinion?
Current inet/cidr
<< is contained within
<<= is contained within or equals
>> contains
>>= contains or equals
Range types
@> contains range
<@ range is contained by
&& overlap (have points in common)
<< strictly left of
>> strictly right of
>> inet-selfuncs
>> -------------
>>
>> Here I have to honestly admit I am not sure if I can tell if your
>> selectivity function is correct. Your explanation sounds convincing and the
>> general idea of looking at the histogram is correct. I am just have no idea
>> if the details are right.
>
> I tried to improve it in this version. I hope it is more readable now. I used
> the word inclusion instead of overlap, made some changes on the algorithm
> to make it more suitable to the other operators.
>
> Using the histogram for inclusion which is originally for basic comparison,
> is definitely not correct. It is an empirical approach. I have tried several
> versions of it, tested them with different data sets and thought it is better
> than nothing.
Thanks for the updates. The code in this version is cleaner and easier
to follow.
I am not convinced of your approach to calculating the selectivity from
the histogram. The thing I have the problem with is the clever trickery
involved with how you handle different operator types. I prefer the
clearer code of the range types with how calc_hist_selectivity_scalar is
used. Is there any reason for why that approach would not work here or
result in worse code?
I have attached a patch where I improved the language of your comment
describing the workings of the selectivity estimation. Could you please
check it so I did not change the meaning of any of it?
I see from the tests that you still are missing selectivity functions
for operators, what is your plan for this?
--
Andreas Karlsson
Attachment | Content-Type | Size |
---|---|---|
inet-selfuncs-commentfix.patch | text/x-patch | 6.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2014-01-31 02:33:25 | Re: Making strxfrm() blobs in indexes work |
Previous Message | Merlin Moncure | 2014-01-31 01:13:08 | Re: jsonb and nested hstore |