From: | Tilmann Singer <tils-pgsql(at)tils(dot)net> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: tsearch2 dictionary that indexes substrings? |
Date: | 2007-04-20 10:21:51 |
Message-ID: | 20070420102151.GA15825@tils.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
* Listmail <lists(at)peufeu(dot)com> [20070420 11:25]:
> You want trigram based search.
> ie.
>
> postgresql -> 'pos', 'ost', 'stg', 'tgr', 'gre', 'res',
> 'esq', 'sql'
>
> searching for 'gresq' is searching for 'gre' and 'res' and
> 'esq' which is good friends with bitmap scan. Then a little LIKE
> '%gresq%' to filter the results.
I'm not sure how that would fit in with tsearch2 to do full text
search so that I can do queries like
select * from content where plainto_tsquery(:q) @@ to_tsvector(body)
If the possible substrings were already indexed like Oleg suggested in
his reply through writing a custom C dictionary, a query like above
with q='foo' would find rows from the table content where body
contains 'foobar' for instance.
However I've seen the example to create a trigram index on a unique
word list to provide alternative spelling suggestions to the user
which looked very useful.
> PS : indexing all substring means for long words you get huge
> number of lexems...
I'm aware of that and in my case I don't think it will be a
problem. It is for a type-ahead search web interface so actually it
only requires indexing all possible substrings starting from char 1,
ie. p, po, pos, post, postg, postgr, postgre, postgres, postgresq,
postgresql.
Til
From | Date | Subject | |
---|---|---|---|
Next Message | Mike Frysinger | 2007-04-20 10:35:24 | Re: cant get pg_dump/pg_restore to behave |
Previous Message | Oleg Bartunov | 2007-04-20 09:27:36 | Re: tsearch2 dictionary that indexes substrings? |