Re: tsearch2 dictionary that indexes substrings?

From: Tilmann Singer <tils-pgsql(at)tils(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: tsearch2 dictionary that indexes substrings?
Date: 2007-04-20 10:21:51
Message-ID: 20070420102151.GA15825@tils.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

* Listmail <lists(at)peufeu(dot)com> [20070420 11:25]:
> You want trigram based search.
> ie.
>
> postgresql -> 'pos', 'ost', 'stg', 'tgr', 'gre', 'res',
> 'esq', 'sql'
>
> searching for 'gresq' is searching for 'gre' and 'res' and
> 'esq' which is good friends with bitmap scan. Then a little LIKE
> '%gresq%' to filter the results.

I'm not sure how that would fit in with tsearch2 to do full text
search so that I can do queries like

select * from content where plainto_tsquery(:q) @@ to_tsvector(body)

If the possible substrings were already indexed like Oleg suggested in
his reply through writing a custom C dictionary, a query like above
with q='foo' would find rows from the table content where body
contains 'foobar' for instance.

However I've seen the example to create a trigram index on a unique
word list to provide alternative spelling suggestions to the user
which looked very useful.

> PS : indexing all substring means for long words you get huge
> number of lexems...

I'm aware of that and in my case I don't think it will be a
problem. It is for a type-ahead search web interface so actually it
only requires indexing all possible substrings starting from char 1,
ie. p, po, pos, post, postg, postgr, postgre, postgres, postgresq,
postgresql.

Til

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Mike Frysinger 2007-04-20 10:35:24 Re: cant get pg_dump/pg_restore to behave
Previous Message Oleg Bartunov 2007-04-20 09:27:36 Re: tsearch2 dictionary that indexes substrings?