From: | alexander lunyov <lan(at)zato(dot)ru> |
---|---|
To: | pgsql-ru-general(at)postgresql(dot)org |
Subject: | Re: full text search, utf8 |
Date: | 2009-06-03 11:56:59 |
Message-ID: | 4A26650B.400@zato.ru |
Lists: | pgsql-ru-general |
I can answer in English if you like.
This error also happens when I try to CREATE TEXT SEARCH DICTIONARY:
ports=# CREATE TEXT SEARCH DICTIONARY ruispell (
ports(# TEMPLATE = ispell,
ports(# DictFile = russian,
ports(# AffFile = russian,
ports(# StopWords = russian
ports(# );
ERROR: invalid byte sequence for encoding "UTF8": 0xd1
HINT: This error can also happen if the byte sequence does not
match the encoding expected by the server, which is controlled by
"client_encoding".
ports=#
All data in the table was populated by a Perl script that reads a text
file in UTF8 and issues INSERTs, and I think that if there had been an
illegal character, the error would have appeared at INSERT time.
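
A minimal sketch of one way to check whether a file is valid UTF-8 and, if not, where the first offending byte sits. It is not part of the original exchange: the function names are hypothetical, and applying it to the dictionary, affix, and stop-word files named in the command above (rather than only to the INSERT source text) rests on the assumption that the server reads those files when the dictionary is created, which this thread does not confirm as the cause.

```python
from typing import Optional, Tuple


def first_invalid_utf8(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (offset, byte value) of the first byte that breaks
    UTF-8 decoding, or None if the data decodes cleanly."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte of the bad sequence.
        return exc.start, data[exc.start]
    return None


def check_file(path: str) -> Optional[Tuple[int, int]]:
    """Read a file in binary mode and run the same check on its contents."""
    with open(path, "rb") as fh:
        return first_invalid_utf8(fh.read())
```

Calling check_file() on each candidate file returns None for clean UTF-8; any other result gives the byte offset to inspect. A lone 0xd1 followed by an ASCII byte is reported because 0xd1 starts a two-byte UTF-8 sequence and needs a continuation byte after it.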
Andrew Boag wrote:
> sorry for English response (I don't have Russian keyboard here)
>
> 0xd1 may be an illegal UTF8 character that was mistakenly allowed into
> the table. Not all libraries (or all versions of postgres) prevent
> illegal UTF8 characters from getting into DB.
>
> We saw similar issues with a 7.4 -> 8.1 postgres data migration.
>
> However, I don't fully understand your select query so there may be
> another cause.
>
> alexander lunyov wrote:
>> Hello.
>>
>> I have FreeBSD 6.2, postgresql-8.3.1
>>
>> In the environment:
>>
>> % env | grep UTF
>> LANG=ru_RU.UTF-8
>> MM_CHARSET=UTF-8
>>
>> % psql ports -U pgsql
>> Welcome to psql 8.3.1, the PostgreSQL interactive terminal.
>>
>> Type: \copyright for distribution terms
>> \h for help with SQL commands
>> \? for help with psql commands
>> \g or terminate with semicolon to execute query
>> \q to quit
>>
>> ports=# \encoding
>> UTF8
>> ports=# \l
>> List of databases
>> Name | Owner | Encoding
>> -----------+----------+-----------
>> ports | pgsql | UTF8
>> postgres | pgsql | UTF8
>> template0 | pgsql | UTF8
>> template1 | pgsql | UTF8
>> (4 rows)
>>
>> I try to search the table, and here is the result:
>>
>> ports=# select name from abonents where to_tsvector(name) @@
>> to_tsquery('s');
>> ERROR: invalid byte sequence for encoding "UTF8": 0xd1
>> HINT: This error can also happen if the byte sequence does not
>> match the encoding expected by the server, which is controlled by
>> "client_encoding".
>>
>> Meanwhile, with the english configuration it works fine.
>>
>> # select count(name) from abonents where to_tsvector('english',name)
>> @@ to_tsquery('some');
>> count
>> -------
>> 6
>> (1 row)
>>
>> Why?
>>
>
>
--
Best regards,
Александр Лунев
ОАО РТК