Re: [tsearch2] Problem with case sensitivity (or with creating own dictionary)

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Krzysztof xaru Rajda <krzysztof(dot)xaru(dot)rajda(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: [tsearch2] Problem with case sensitivity (or with creating own dictionary)
Date: 2013-08-05 16:37:38
Message-ID: Pine.LNX.4.64.1308052034440.32201@sn.sai.msu.ru
Lists: pgsql-general

Please,

take a look at contrib/dict_int and create your own dict_noop.
It should be easy. I think you could document it and share it
with people (wiki.postgresql.org?), since other people have been
interested in a noop dictionary. Also, don't forget to modify
your configuration; use ts_debug(), it will help you.
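
For illustration, the configuration step might look roughly like the sketch below. It assumes a dictionary named dict_noop has already been built and installed along the lines of contrib/dict_int; no such dictionary ships with PostgreSQL, and the configuration name "links" is also made up for the example. The token types url and url_path come from the default parser.

```sql
-- Copy a built-in configuration so the original stays untouched.
-- "links" is a hypothetical name chosen for this example.
CREATE TEXT SEARCH CONFIGURATION links (COPY = pg_catalog.english);

-- Assumption: dict_noop is a dictionary you wrote and installed yourself,
-- modeled on contrib/dict_int. It is not part of PostgreSQL.
ALTER TEXT SEARCH CONFIGURATION links
    ALTER MAPPING FOR url, url_path
    WITH dict_noop;

-- Check which dictionary handles each token and which lexemes it produces.
SELECT alias, token, dictionary, lexemes
FROM ts_debug('links', 'see http://youtube.com/Y6dsHDX');
```

In the ts_debug() output, the token column shows the text as the parser saw it and the lexemes column shows what the mapped dictionary returned, so a dictionary that still lowercases the URL is easy to spot.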

Regards,
Oleg

On Sat, 3 Aug 2013, Krzysztof xaru Rajda wrote:

> Hello,
>
> I ran into a problem. My goal is to extract links from a text using
> tsearch2. Everything seemed to work until I got some YouTube links:
> they contain both lowercase and uppercase letters, and the tsearch parser
> lowercases everything (from http://youtube.com/Y6dsHDX I got
> http://youtube.com/y6dshdx, which does not work). I went through the
> PostgreSQL docs, and it seems that each of the default dictionaries (simple,
> ispell, snowball) lowercases lexemes during normalization, and there is no
> option to disable it.
>
> I started looking for tutorials on how to create my own dictionary or modify
> an existing one (I mean a dictionary like snowball, with my own source
> code, not just a dictionary created by a 'CREATE DICTIONARY...' query), but
> everything I found is really out of date and uses mechanisms that are
> deprecated in the latest version of Postgres (I'm working on v 9.2), like
> 'contrib/gendict' here:
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/custom-dict.html
>
> So now I have no idea what to do about my case sensitivity problem... Is
> there any other way to overcome it, apart from creating my own dictionary?
> If not, how do I create one on Postgres 9.2?
>
> Regards,
> xaru
>
>
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
