From: | Michael Schuh <schuh(dot)mike(at)gmail(dot)com> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | GSOC Student Project Idea |
Date: | 2013-04-22 12:28:21 |
Message-ID: | CAA43Kd3_CA08=QkO6LJ_gftKEU+AmYKvmW6WsdnP_k0mORsOAQ@mail.gmail.com |
Lists: | pgsql-hackers |
Greetings,
My name is Michael Schuh and I am a PhD student in Computer Science
at Montana State University. I have never participated in GSOC before, but
I am very excited to propose a project to PostgreSQL that I feel would be a
great follow-up to last year's project by Alexander Korotkov
(http://www.google-melange.com/gsoc/project/google/gsoc2012/akorotkov/53002).
I contacted Mr. Korotkov's mentor from last year, Mr. Heikki Linnakangas,
and he suggested I email this mailing list with my idea.
In brief, I would like to implement a state-of-the-art indexing algorithm
(named iDistance) directly in PostgreSQL using GiST or SP-GiST trees, or
whatever other means prove necessary. It is an ideal follow-up to last
year's project by Mr. Korotkov, which implemented classical indexing
structures for range queries. I strongly believe the community would
greatly benefit from the inclusion of iDistance, which has been shown to be
dramatically more effective than R-trees and KD-trees, especially for kNN
queries and for data above 10-20 dimensions.
A major focus of my current PhD thesis is high-dimensional data indexing
and retrieval, with an emphasis on applied use in CBIR systems.
Recently, I published work which introduced a new open source
implementation of iDistance in C++ (and some Python), which I believe makes
me highly qualified and motivated for this opportunity. I have been
strongly considering a PostgreSQL implementation for easy plug-and-play
use in existing applications, but with academic grant funding, the priority
is low. Below are links to my Google Code repository and recent
publication. I am happy to discuss any of this in further detail if you'd
like.
https://code.google.com/p/idistance/
http://www.cs.montana.edu/~timothy.wylie/files/bncod13.pdf
Although I do not have a lot of experience with PostgreSQL development, I
am eager to learn and commit my summer to enabling another fantastic
feature for the community. Since iDistance is a non-recursive, data-driven,
space-based partitioning strategy which builds directly onto a B+-tree, I
believe the implementation should be possible using only GiST support.
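As a rough illustration of the mapping described above, here is a minimal
sketch based on the published iDistance scheme, not the code from the
repository linked above. Each point is assigned to its nearest reference
point O_i and given the one-dimensional key i*c + dist(p, O_i), so that a
single ordered index can hold every partition. The names build_index and
range_query, the constant c, and the sorted Python list standing in for the
B+-tree are all assumptions made for this example.

```python
import bisect
import math


def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(points, refs, c):
    """Map every point to the key i*c + dist(p, refs[i]), where refs[i] is
    its nearest reference point. c is assumed to exceed any
    point-to-reference distance so that partitions never overlap."""
    entries = []
    radius = [0.0] * len(refs)  # distance of the farthest member per partition
    for p in points:
        i = min(range(len(refs)), key=lambda j: dist(p, refs[j]))
        d = dist(p, refs[i])
        radius[i] = max(radius[i], d)
        entries.append((i * c + d, p))
    entries.sort(key=lambda e: e[0])  # sorted list stands in for the B+-tree
    keys = [k for k, _ in entries]
    return keys, entries, radius


def range_query(index, refs, c, q, r):
    """Return all indexed points within distance r of q."""
    keys, entries, radius = index
    hits = []
    for i, ref in enumerate(refs):
        dq = dist(q, ref)
        if dq - r > radius[i]:
            continue  # the query sphere cannot reach this partition
        # One contiguous key interval per partition that the query touches.
        lo = i * c + max(0.0, dq - r)
        hi = i * c + min(radius[i], dq + r)
        start = bisect.bisect_left(keys, lo)
        end = bisect.bisect_right(keys, hi)
        for _, p in entries[start:end]:
            if dist(p, q) <= r:  # discard false positives from the 1-D scan
                hits.append(p)
    return hits
```

Because each partition that a query touches turns into one interval scan
over the ordered keys, the structure needs nothing beyond an ordered index,
which is the reason I expect it to fit on top of existing index
infrastructure.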
Please let me know if this is of any interest, or if you have any
additional questions. Unfortunately, I will be unavailable most of the day,
but I plan to fill out the GSOC application later this evening.
Thank you for your time,
Mike Schuh