From: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
Subject: | Re: spgist text_ops and LIKE |
Date: | 2012-02-02 19:07:21 |
Message-ID: | CAPpHfduR4898K+heSgBtTyiNoUF1fqOdo3KTHvmnQzf1CFU69g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Feb 2, 2012 at 10:21 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I'm having trouble figuring out under what set of circumstances spgist
> is expected to be the best available alternative. It only supports a
> small subset of the data types that GiST does, so I suppose the point
> is that it should be faster for the cases that it does handle. And,
> considering that this is all brand-new code, the fact that it's almost
> keeping pace with btree on both pattern-matching and equality
> comparisons is certainly respectable -- but I so far haven't found any
> cases where it's a clear win. There's limited glory in being the
> almost-fastest way of indexing for a certain class of queries.
>
I think that current implementation of suffix tree (in particular
implementation of consistent methods) in spgist is far from reveal full
potential of suffix tree. I see at least two additional search types which
can be efficiently evaluated using suffix tree:
1) Search by prefix regular expression. For example, search for
/^a(bc|d)e[fgh]/ in single index scan.
2) Search by levenshtein distance, i.e. "SELECT * FROM tbl WHERE
levenshtein(col, 'some constant') < k;" using index (surely actual syntax
for such index scan would be anouther).
This is only two additional applications which first comes to my mind,
there could be other. Unfortunately, I don't have enough of time for it
now. But I like to put my hands on this in future.
Admittedly, I haven't tried the point-in-box stuff yet.
>
In short, I believe spgist for geometrical data-types is more CPU-effective
and less IO-effective than gist. Gist have to call consistent method for
all index tuples of the page it uses in scan, while spgist scans smaller
nodes. Sure, comprehensive testing is required for doing final conclusion
about that.
-----
With best regards,
Alexander Korotkov.
From | Date | Subject | |
---|---|---|---|
Next Message | Magnus Hagander | 2012-02-02 19:14:12 | Re: keywords in pg_hba.conf are case-sensitive? |
Previous Message | Tom Lane | 2012-02-02 19:02:05 | Re: spgist text_ops and LIKE |