Re: Google Summer of Code: question about GiST API advancement project

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: GUO Rui <ruig2(at)uci(dot)edu>
Cc: pgsql-hackers(at)postgresql(dot)org, a(dot)korotkov(at)postgrespro(dot)ru
Subject: Re: Google Summer of Code: question about GiST API advancement project
Date: 2019-03-31 17:52:31
Message-ID: 07514C22-17F0-4E6F-AA06-6F3B92C8B6F7@yandex-team.ru
Lists: pgsql-hackers

Hi!

> On 31 March 2019, at 14:58, GUO Rui <ruig2(at)uci(dot)edu> wrote:
>
> I'm Rui Guo, a PhD student focusing on databases at the University of California, Irvine. I'm interested in the "GiST API advancement" project for the Google Summer of Code 2019, which is listed at https://wiki.postgresql.org/wiki/GSoC_2019#GiST_API_advancement_.282019.29 .
>
> I'm still reading about the RR*-tree, GiST and the PostgreSQL source code to get a better idea for my proposal. Meanwhile, I have a very basic and simple question:
>
> Since the chooseSubtree() algorithms in both the R*-tree and the RR*-tree are heuristic and somewhat greedy (e.g. pick the MBB that needs the least enlargement), is it possible to apply a machine learning algorithm to improve them? The only related reference I found uses deep learning for database join operations (https://arxiv.org/abs/1808.03196). Is machine learning unsuitable here, or has someone already done this?

If you are interested in ML and DBs, you should definitely look into [0]. You do not have to base your proposal on the mentors' ideas; you can use your own. Implementing learned indexes seems reasonable.

RR*-tree algorithms are heuristic in some specific parts, but in general they are designed to optimize very clear metrics. Generally, ML algorithms tend to compose a much bigger pile of heuristics and to solve less mathematically clear tasks than splitting subtrees or choosing a subtree for insertion.
R*-tree algorithms are heuristic only for the sake of speed.
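
To illustrate what a "very clear metric" means here, below is a minimal sketch of the greedy rule mentioned in the question: pick the child whose MBB needs the least area enlargement to cover the new entry. The names Rect, enlargement and choose_subtree are hypothetical and used only for illustration. This is neither the PostgreSQL GiST code nor the full R*-tree or RR*-tree ChooseSubtree, which also take other criteria into account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned 2-D minimum bounding box (MBB). Hypothetical helper type."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def area(self) -> float:
        return (self.xmax - self.xmin) * (self.ymax - self.ymin)

    def union(self, other: "Rect") -> "Rect":
        # Smallest box that covers both rectangles.
        return Rect(min(self.xmin, other.xmin), min(self.ymin, other.ymin),
                    max(self.xmax, other.xmax), max(self.ymax, other.ymax))


def enlargement(box: Rect, new: Rect) -> float:
    """Extra area `box` would need in order to cover `new`."""
    return box.union(new).area() - box.area()


def choose_subtree(children: list[Rect], new: Rect) -> int:
    """Index of the child needing the least enlargement.

    Ties are broken by the smaller current area (an assumption of this
    sketch, following the classic R-tree rule).
    """
    return min(range(len(children)),
               key=lambda i: (enlargement(children[i], new), children[i].area()))
```

The metric being minimized is explicit (added area), so the greedy choice is a speed shortcut rather than a guess about an unclear objective.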

Best regards, Andrey Borodin.

[0] https://arxiv.org/pdf/1712.01208.pdf
