| From: | Marcin Mańk <marcin(dot)mank(at)gmail(dot)com> |
|---|---|
| To: | Gordon Mohr <gojomo-pgsql(at)xavvy(dot)com> |
| Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: high-dimensional knn-GIST tests (was Re: Cube extension kNN support) |
| Date: | 2013-10-27 20:43:54 |
| Message-ID: | CAK61fk4gh8qRc_0+yig4VnjCPpizUt-dq=dguxUVQ-D=Ztx_Ng@mail.gmail.com |
| Lists: | pgsql-hackers |
On Thu, Oct 24, 2013 at 3:50 AM, Gordon Mohr <gojomo-pgsql(at)xavvy(dot)com> wrote:
> On 9/22/13 4:38 PM, Stas Kelvich wrote:
>
>> Hello, hackers.
>>
>> Here is the patch that introduces kNN search for cubes with
>> euclidean, taxicab and chebyshev distances.
>>
>
> Thanks for this! I decided to give the patch a try at the bleeding edge
> with some high-dimensional vectors, specifically the 1.4 million
> 1000-dimensional Freebase entity vectors from the Google 'word2vec' project:
>
I believe the curse of dimensionality is affecting you here. I think it is
impossible to get an improvement over a sequential scan for 1000-dimensional
vectors. Read here:
http://en.wikipedia.org/wiki/Curse_of_dimensionality#k-nearest_neighbor_classification
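
A rough way to see the effect is to measure how much farther the farthest
point is than the nearest one as the dimension grows. The sketch below is only
an illustration under assumptions of my own: uniformly random points in the
unit hypercube (real word2vec vectors are not uniform), and a made-up helper
name, relative_contrast. It says nothing about the cube patch itself. When the
contrast gets small, nearly every point is about equally far from the query,
so an index has little to prune.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (farthest - nearest) / nearest for one random query point.

    Points are drawn uniformly from the unit hypercube; this is an
    assumption made for illustration, not a model of real data.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        # Euclidean distance between the query and this point.
        dists.append(math.dist(query, point))
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest


if __name__ == "__main__":
    # Print the contrast for a few dimensions, low to high.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

With uniform data the printed contrast should shrink as the dimension goes up,
which is the distance concentration the Wikipedia section describes.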
Regards
Marcin Mańk