Re: GSoC proposal. Index-only scans for GIST

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GSoC proposal. Index-only scans for GIST
Date: 2014-03-18 19:11:31
Message-ID: CAPpHfdu1WV7v6aF7kxTomrwpWq4xB=GBpEqBn89zOtTx6pFnjQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Josh,

Anastasia has already consulted to me in person. It is not big proposal.
But for newbie who is not familiar with PostgreSQL code base and especially
GiST it seems fair enough.

----
With best regards,
Alexander Korotkov.

On Tue, Mar 18, 2014 at 9:16 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:

> Alexander,
>
> Can you comment on the below proposal? I'd like your opinion on how
> difficult it will be.
>
> On 03/18/2014 06:12 AM, Anastasia Lubennikova wrote:
> > Hello!
> > Here is the text of my proposal which I've applied to GSoC.
> > (and link
> >
> http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/lubennikovaav/5629499534213120
> )
> > Any suggestions and comments are welcome.
> >
> > *Project name*
> >
> > Support for index-only scans for GIST index
> >
> >
> >
> > *Brief review*
> >
> > Currently GiST index don't have index-only scan functionality. Index-only
> > scans are a major performance feature added to Postgres 9.2. They allow
> > certain types of queries to be satisfied just by retrieving data from
> > indexes, and not from tables. This feature for B-trees (implemented since
> > PostgreSQL-9.2) allows doing certain types of queries significantly
> faster.
> > This is achieved by reduction in the amount of I/O necessary to satisfy
> > queries. I think it will be useful to implement this feature for user
> > defined data types that use GiST index.
> >
> >
> >
> > *Benefits to the PostgreSQL Community*
> >
> > Faster GiST index search would be actual for many PostgreSQL applications
> > (for example some GIS systems).
> >
> >
> > *Quantifiable results*
> >
> > Acceleration of GiST index search.
> >
> >
> > *Project details*
> >
> > 1. The GiST is a balanced tree structure like a B-tree, containing <key,
> > pointer> pairs. GiST key is a member of a user-defined class, and
> > represents some property that is true of all data items reachable from
> the
> > pointer associated with the key. The GiST provides a possibility to
> create
> > custom data types with indexed access methods and extensible set of
> queries.
> >
> > There are seven methods that an index operator class for GiST must
> provide,
> > and an eighth that is optional.
> >
> > -consistent
> >
> > -union
> >
> > -compress
> >
> > -decompress
> >
> > -penalty
> >
> > -picksplit
> >
> > -equal
> >
> > -distance (optional)
> >
> > I'm going to create new optional method fetch() in addition to existing.
> > Thus user can create a method of retrieving data from the index without
> > loss. It will be used when performing search queries to speed data
> > retrieving.
> >
> >
> >
> > 2. gistget algorithm (several parts omitted):
> >
> > Check the key
> > gistindex_keytest() - does this index tuple satisfy the scan key(s)?
> >
> > Scan all items on the GiST index page and insert them into the queue (or
> > directly to output areas)
> >
> > plain scan
> >
> > Heap tuple TIDs are returned into so->pageData[]
> >
> > ordered scan
> >
> > Heap tuple TIDs are pushed into individual search queue items
> >
> > If the fetch() is specified by the developer, then using it, algorithm
> can
> > retrieve the data directly to output areas at this stage, without
> reference
> > to the heap.
> >
> >
> > 3. Create function gistcanreturn to check whether fetch() is specified by
> > user.
> >
> > Amcanreturn - Function to check whether index supports index-only scans,
> or
> > zero if none
> >
> > There is the question of support index-only scans for multicolumn
> indexes.
> > Probably it would require extend the interface to add separate columns
> > checking.
> >
> > To solve this question I need the community's help.
> >
> >
> > 4. Add details for index only scans into gistcostestimate function
> >
> >
> >
> > *Links*
> >
> > 1) Hellerstein J. M., Naughton J. F., Pfeffer A. Generalized search
> > trees for database systems. - September, 1995.
> >
> > 2) http://www.sai.msu.su/~megera/postgres/gist/
> >
> > 3) PostgreSQL 9.3.3 Documentation: chapters 11. Indexes, 55. GiST
> > Indexes.
> >
>
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Berkus 2014-03-18 19:13:12 Re: GSoC proposal. Index-only scans for GIST
Previous Message Robert Haas 2014-03-18 19:10:26 Re: Portability issues in shm_mq