Re: GSoC proposal. Index-only scans for GIST

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: GSoC proposal. Index-only scans for GIST
Date: 2014-03-18 17:16:27
Message-ID: 53287F6B.7040707@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alexander,

Can you comment on the below proposal? I'd like your opinion on how
difficult it will be.

On 03/18/2014 06:12 AM, Anastasia Lubennikova wrote:
> Hello!
> Here is the text of my proposal which I've applied to GSoC.
> (and link
> http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/lubennikovaav/5629499534213120)
> Any suggestions and comments are welcome.
>
> *Project name*
>
> Support for index-only scans for GIST index
>
>
>
> *Brief review*
>
> Currently GiST index don't have index-only scan functionality. Index-only
> scans are a major performance feature added to Postgres 9.2. They allow
> certain types of queries to be satisfied just by retrieving data from
> indexes, and not from tables. This feature for B-trees (implemented since
> PostgreSQL-9.2) allows doing certain types of queries significantly faster.
> This is achieved by reduction in the amount of I/O necessary to satisfy
> queries. I think it will be useful to implement this feature for user
> defined data types that use GiST index.
>
>
>
> *Benefits to the PostgreSQL Community*
>
> Faster GiST index search would be actual for many PostgreSQL applications
> (for example some GIS systems).
>
>
> *Quantifiable results*
>
> Acceleration of GiST index search.
>
>
> *Project details*
>
> 1. The GiST is a balanced tree structure like a B-tree, containing <key,
> pointer> pairs. GiST key is a member of a user-defined class, and
> represents some property that is true of all data items reachable from the
> pointer associated with the key. The GiST provides a possibility to create
> custom data types with indexed access methods and extensible set of queries.
>
> There are seven methods that an index operator class for GiST must provide,
> and an eighth that is optional.
>
> -consistent
>
> -union
>
> -compress
>
> -decompress
>
> -penalty
>
> -picksplit
>
> -equal
>
> -distance (optional)
>
> I'm going to create new optional method fetch() in addition to existing.
> Thus user can create a method of retrieving data from the index without
> loss. It will be used when performing search queries to speed data
> retrieving.
>
>
>
> 2. gistget algorithm (several parts omitted):
>
> Check the key
> gistindex_keytest() - does this index tuple satisfy the scan key(s)?
>
> Scan all items on the GiST index page and insert them into the queue (or
> directly to output areas)
>
> plain scan
>
> Heap tuple TIDs are returned into so->pageData[]
>
> ordered scan
>
> Heap tuple TIDs are pushed into individual search queue items
>
> If the fetch() is specified by the developer, then using it, algorithm can
> retrieve the data directly to output areas at this stage, without reference
> to the heap.
>
>
> 3. Create function gistcanreturn to check whether fetch() is specified by
> user.
>
> Amcanreturn - Function to check whether index supports index-only scans, or
> zero if none
>
> There is the question of support index-only scans for multicolumn indexes.
> Probably it would require extend the interface to add separate columns
> checking.
>
> To solve this question I need the community's help.
>
>
> 4. Add details for index only scans into gistcostestimate function
>
>
>
> *Links*
>
> 1) Hellerstein J. M., Naughton J. F., Pfeffer A. Generalized search
> trees for database systems. - September, 1995.
>
> 2) http://www.sai.msu.su/~megera/postgres/gist/
>
> 3) PostgreSQL 9.3.3 Documentation: chapters 11. Indexes, 55. GiST
> Indexes.
>

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2014-03-18 17:16:46 Re: Portability issues in shm_mq
Previous Message Greg Stark 2014-03-18 17:13:52 json/jsonb/hstore operator precedence