Re: Full text index without accents

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: "Fco. Mario Barcala Rodríguez" <lbarcala(at)freeresearch(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Full text index without accents
Date: 2008-07-22 22:27:03
Message-ID: Pine.LNX.4.64.0807230226120.11363@sn.sai.msu.ru
Lists: pgsql-general

Here is an example:

CREATE FUNCTION dropatsymbol(text) RETURNS text
AS 'select replace($1, ''@'', '' '');'
LANGUAGE SQL;

arxiv=# select to_tsvector('english',dropatsymbol('oleg@sai.msu.su'));
to_tsvector
-------------------------
'oleg':1 'sai.msu.su':2
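
Applied to the accent question, a minimal sketch could look like the
following. The function name dropaccents and the two character lists are
only illustrative assumptions (they cover a handful of accented vowels, not
every accented character); the table document and its column text are taken
from the quoted message below. The function is declared IMMUTABLE because
functions used in an index expression have to be immutable.

```sql
-- Hypothetical helper: lowercases the input, then maps a few accented
-- vowels to their unaccented forms. Extend both lists as needed; keep
-- them the same length and in the same order.
CREATE FUNCTION dropaccents(text) RETURNS text
AS $$ SELECT translate(lower($1), 'áéíóúü', 'aeiouu') $$
LANGUAGE SQL IMMUTABLE;

-- Expression index over the preprocessed text.
CREATE INDEX textindex ON document USING
gin(to_tsvector('english', dropaccents(text)));

-- Apply the same preprocessing to the query so both sides match.
SELECT * FROM document
WHERE to_tsvector('english', dropaccents(text))
      @@ to_tsquery('english', dropaccents('Rodríguez'));
```

The WHERE clause repeats the indexed expression exactly, which is what
lets the planner consider the index for the search.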

On Tue, 22 Jul 2008, Fco. Mario Barcala Rodríguez wrote:

> And what are the argument and return types of a PL/SQL
> function that preprocesses the text?
>
> I have found that, for example, something like this works fine:
>
> CREATE INDEX textindex ON document USING
> gin(to_tsvector('english',upper(text)));
>
> where text is the text column of document. But I have tried to do
> something like:
>
> CREATE INDEX textindex ON document USING
> gin(to_tsvector('english',myfunction(text)));
>
> where myfunction is a PL/SQL function which calls upper, but I couldn't
> find out what the types of myfunction's argument and return value should be.
>
> I am a PL/SQL novice and I haven't found out how to do it yet. Of course,
> I will then have to change the upper experiment to my real objective:
> indexing without accents. I don't know if PL/SQL is the best option for
> building such a function.
>
> Thanks,
>
> Mario Barcala
>
>> You can preprocess the text (replace accents with nothing) before
>> to_tsvector or to_tsquery
>>
>>
>>
>> Oleg
>> On Thu, 3 Jul 2008, lbarcala(at)freeresearch(dot)org wrote:
>>
>>> Hi again:
>>>
>>> I am trying to create a full text configuration to ignore word accents in
>>> my searches. My approach is similar to the simple dictionary one, but I
>>> want to remove accents after converting to lowercase.
>>>
>>> Is the only way to do this to write my own dict_noaccent.c, and then
>>> compile and install it into the system?
>>>
>>> Regards,
>>>
>>> Mario Barcala
>>>
>>>
>>>
>>
>> Regards,
>> Oleg
>> _____________________________________________________________
>> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
>> Sternberg Astronomical Institute, Moscow University, Russia
>> Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
>> phone: +007(495)939-16-83, +007(495)939-23-83
>
>
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
