From: | Krzysztof xaru Rajda <krzysztof(dot)xaru(dot)rajda(at)gmail(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | [tsearch2] Problem with case sensitivity (or with creating own dictionary) |
Date: | 2013-08-05 09:44:39 |
Message-ID: | 51FF7407.9090202@gmail.com |
Lists: | pgsql-general |
Hello,
I have run into a problem. My goal is to extract links from text
using tsearch2. Everything seemed to work until I hit some YouTube
links: they contain both lowercase and uppercase letters, and tsearch
lowercases everything (from http://youtube.com/Y6dsHDX I got
http://youtube.com/y6dshdx, which does not work). I went through the
PostgreSQL docs, and it seems that each of the default dictionaries (simple,
ispell, snowball) lowercases lexemes during normalization, and there is
no option to disable it.
I started to look for tutorials on how to create my own dictionary or
modify an existing one (I mean a dictionary like snowball, with my
own source code, not just a dictionary created by a 'CREATE
DICTIONARY...' query), but everything I found is badly out of date and uses
mechanisms that are deprecated in the latest versions of Postgres (I'm
working on v9.2), like 'contrib/gendict' here:
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/custom-dict.html
So now I have no idea what to do about my case-sensitivity problem. Is
there any other way to overcome it, apart from creating my own dictionary?
If not, how do I create one on Postgres 9.2?
Regards,
xaru
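
A possible workaround, offered as a sketch rather than a confirmed answer: the
case folding described above happens when a dictionary normalizes a token,
while the parser itself reports tokens as they appear in the input. Assuming
the default parser classifies the link as a 'url' token (which, as far as I
know, does not include the 'http://' protocol part), the original spelling can
be read with ts_parse and ts_token_type, both of which exist in 9.2. The sample
text in the query is made up for illustration.

```sql
-- Sketch: read raw parser tokens instead of normalized lexemes.
-- ts_parse() returns each token as the parser saw it, before any
-- dictionary has a chance to lowercase it.
SELECT p.token
FROM ts_parse('default',
              'watch http://youtube.com/Y6dsHDX today') AS p  -- sample text (assumption)
JOIN ts_token_type('default') AS t
  ON t.tokid = p.tokid
WHERE t.alias = 'url';  -- keep only tokens the parser classified as URLs
```

If the protocol is needed, the caller would have to add it back, since the
default parser reports it as a separate 'protocol' token.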