From: | Listmail <lists(at)peufeu(dot)com> |
---|---|
To: | "Tilmann Singer" <tils-pgsql(at)tils(dot)net>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: tsearch2 dictionary that indexes substrings? |
Date: | 2007-04-20 09:22:21 |
Message-ID: | op.tq2sbjeizcizji@apollo13 |
Lists: | pgsql-general |
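You want trigram-based search, i.e. split each word into its overlapping
3-character pieces:
postgresql -> 'pos', 'ost', 'stg', 'tgr', 'gre', 'res', 'esq', 'sql'
Searching for 'gresq' then means searching for 'gre' AND 'res' AND 'esq',
which plays well with a bitmap scan. Then add a LIKE '%gresq%' to filter
out the false positives, i.e. words that contain all three trigrams but
not the substring itself.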
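To make the two steps concrete (trigram lookup, then a substring recheck), here is a minimal sketch in plain Python. It is only an illustration of the idea, not tsearch2 or any PostgreSQL API: the function names `trigrams`, `build_index` and `search` are made up for this example, and the in-memory dict stands in for whatever index the database would use.

```python
from collections import defaultdict


def trigrams(word):
    """Return the overlapping 3-character windows of word, in order."""
    return [word[i:i + 3] for i in range(len(word) - 2)]


def build_index(words):
    """Map each trigram to the set of words that contain it."""
    index = defaultdict(set)
    for w in words:
        for t in trigrams(w):
            index[t].add(w)
    return index


def search(index, needle):
    """Intersect the per-trigram word sets, then recheck each candidate."""
    grams = trigrams(needle)
    if not grams:
        raise ValueError("needle must have at least 3 characters")
    # Step 1: words that contain every trigram of the needle (the AND part).
    candidates = set.intersection(*(index.get(t, set()) for t in grams))
    # Step 2: the equivalent of LIKE '%needle%', dropping false positives.
    return sorted(w for w in candidates if needle in w)


# Hypothetical sample data: 'esqgres' has 'gre', 'res' and 'esq' but not 'gresq'.
words = ["postgresql", "esqgres", "mysql"]
index = build_index(words)
print(search(index, "gresq"))
```

The made-up word 'esqgres' shows why the final filter is needed: it survives the trigram intersection but is dropped by the substring check.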
PS: indexing all substrings means that long words produce a huge number
of lexemes, since the count grows quadratically with the word length.
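A rough way to see the difference in index size, again as a plain Python sketch: the helper names `substrings` and `trigram_count` are invented here, and the minimum substring length of 2 is an assumption taken from the 'foobar' example in the quoted message below.

```python
def substrings(word, min_len=2):
    """Return the distinct substrings of word with at least min_len characters."""
    return {
        word[i:j]
        for i in range(len(word))
        for j in range(i + min_len, len(word) + 1)
    }


def trigram_count(word):
    """Return the number of distinct 3-character windows in word."""
    return len({word[i:i + 3] for i in range(len(word) - 2)})


for w in ("foobar", "postgresql"):
    print(w, len(substrings(w)), trigram_count(w))
```

The substring count grows like n*(n-1)/2 for a word of n characters, while the trigram count is at most n-2.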
On Fri, 20 Apr 2007 11:06:26 +0200, Tilmann Singer <tils-pgsql(at)tils(dot)net>
wrote:
> Hi,
>
> In this http://archive.netbsd.se/?ml=pgsql-hackers&a=2006-10&t=2475028
> thread Teodor Sigaev describes a way to make tsearch2 index substrings
> of words:
>
>> Brain storm method:
>>
>> Develop a dictionary which returns all substring for lexeme, for
>> example for word foobar it will be 'foobar fooba foob foo fo oobar
>> ooba oob oo obar oba ob bar ba ar'
>
> Did anyone do that already?
>
> If I understand it correctly such a dictionary would require to write
> a custom C component - is that correct? Or could I get away with
> writing a plpgsql function that does the above and hooking that
> somehow into the tsearch2 config?
>
>
> thanks, Til