Re: [GENERAL] I did some testing of GIST/GIN vs BTree indexing…

From: Guyren Howe <guyren(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject: Re: [GENERAL] I did some testing of GIST/GIN vs BTree indexing…
Date: 2014-12-11 01:27:16
Message-ID: D64A408B-A78D-4417-92BE-61178B71C3B8@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Dec 6, 2014, at 12:38 , Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
> On Wed, Dec 3, 2014 at 01:15:50AM -0800, Guyren Howe wrote:
>> GIN is certainly not the “three times” size suggested in the docs, but perhaps
>> that just hasn’t been updated for the 9.4 improvements. Certainly, there isn’t
>> sufficient difference here to make the BTree advantage compelling in most
>> applications.
>
> I am sure the docs need updating for 9.4 — any suggestions?

I want to get to the point where I can make fairly definitive statements about indexing regular fields with GIST or GIN. When I do, I’ll be happy to write something for the docs. If folks here can help me get to that point, all to the better of all… :-)

>> Given the futility of database benchmarking in general, I didn’t want to go any
>> further with this. What I was interested in was whether it might be worth
>> switching from BTree to GIST/GIN indexes with regular sorts of data. It appears
>> to be the case that GIST and GIN are often better than BTree in general, and
>> given their much greater flexibility in satisfying queries on different
>> columns, it might even be the case that one should recommend a single GIST or
>> GIN index on the frequently-searched columns of a table in most cases?
>
> What GiST and GIN "ops" did you use for the testing? Was it
> contrib/btree_gist and contrib/btree_gin?

Sorry; yes. I didn’t realize there was any practical alternative. Is there another option I should test?

> You might want to look at my presentation on indexing:
>
> http://momjian.us/main/presentations/features.html#indexing
>
> It is my understanding that btree is best for single-match indexes like
> unique indexes, or range queries (not range data types), while GIN is
> best for indexes with many duplicates. GiST is more of an indexing
> framework and I am unclear where it is best except in cases where is the
> only option, like geometry and perhaps range (shared with SP-GiST).
> With the 9.4 GIN improvements I am unclear if GiST is ever better for
> full text indexing compared to GIN.

Thanks for this. I will look at your presentation.

As I say, if folks can help me work out the definitive answer to all this, I’d love to contribute it to the docs.

My starting point was this: given that GIN (and GIST, maybe, the docs sort-of say “sort of”) can use arbitrary index fields, rather than left to right, if you’re in a situation of wanting to query arbitrary subsets of some of the fields on a table, it seems likely that a GIN index might be called for. Is that right? The description I’ve been able to find (that it’s a BTree with more sophisticated handling of duplicates) would surely entail otherwise, but this is clearly what the docs say.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2014-12-11 01:53:30 Re: Stored procedure workflow question
Previous Message Guyren Howe 2014-12-11 01:20:12 Re: service allowing arbitrary relations was Re: hstore, but with fast range comparisons?