From: | Arthur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru> |
---|---|
To: | Mart Palmas <Mart(dot)Palmas(at)datel(dot)ee> |
Cc: | "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: TO_TSVECTOR acts differently with national charcters |
Date: | 2017-08-24 19:01:34 |
Message-ID: | 20170824190134.GA1699@arthur.localdomain |
Lists: | pgsql-bugs |
On Tue, Aug 22, 2017 at 08:53:45AM +0000, Mart Palmas wrote:
>
> The string is converted to vector differently, when the string contains national characters "äöüõžš".
>
I suppose this is true for all non-ASCII characters. It could be fixed by
patching the text search parser, but some users may be unhappy with that,
because it could break backward compatibility.
> Results are:
> 'bar' 'foo' 'toop/6'
> '/6' 'bar' 'foo' 'tüüp'
Do you expect the first or the second result?
Some users may not want words to be divided at the "/" character, because
"toop/6" can be a path:
=# select * from ts_debug('simple', 'toop/6');
 alias |    description    | token  | dictionaries | dictionary | lexemes
-------+-------------------+--------+--------------+------------+----------
 file  | File or path name | toop/6 | {simple}     | simple     | {toop/6}
(1 row)
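For comparison, the same check could be run on the non-ASCII variant. This is only a sketch: the input strings are assumptions reconstructed from the lexemes quoted above, the 'simple' configuration is carried over from the ts_debug call, and no output is shown because it is not part of this message.

```sql
-- Assumed input: the non-ASCII counterpart of 'toop/6' from the quoted results.
-- Shows which token types the parser assigns to each piece.
SELECT alias, description, token, lexemes
FROM ts_debug('simple', 'tüüp/6');

-- Hypothetical full strings; the exact input and configuration used in the
-- original report are not shown in this message.
SELECT to_tsvector('simple', 'foo bar toop/6') AS ascii_input,
       to_tsvector('simple', 'foo bar tüüp/6') AS non_ascii_input;
```

Comparing the alias column of the two ts_debug calls should show whether the parser treats the non-ASCII string as a single file/path token or splits it at "/".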
--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company