From: | Florian Pflug <fgp(at)phlo(dot)org> |
---|---|
To: | sushant354(at)gmail(dot)com |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: pg_trgm: unicode string not working |
Date: | 2011-06-12 12:40:06 |
Message-ID: | 0A0622C3-C778-4C6A-9345-0A6D81929BCF@phlo.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi
Next time, please post questions regarding the usage of postgres
to the -general list, not to -hackers. The purpose of -hackers is
to discuss the development of postgres proper, not the development
of applications using postgres.
On Jun12, 2011, at 13:33 , Sushant Sinha wrote:
> I am using pg_trgm for spelling correction as prescribed in the
> documentation. But I see that it does not work for unicode sring. The
> database was initialized with utf8 encoding and the C locale.
I think you need to use a locale (more precisely, a CTYPE) in which
'क', 'त', 'द' are considered to be alphanumeric.
You can specify the CTYPE when creating the database with
CREATE DATABASE ... LC_CTYPE = ...
> Here is the table:
> \d words
> Table "public.words"
> Column | Type | Modifiers
> --------+---------+-----------
> word | text |
> ndoc | integer |
> nentry | integer |
> Indexes:
> "words_idx" gin (word gin_trgm_ops)
>
> Query: select word from words where word % 'कतद';
>
> I get an error:
>
> ERROR: GIN indexes do not support whole-index scans
pg_trgm probably ignores non-alphanumeric characters during
comparison, so you end up with an empty search string, which
translates to a whole-index scan. Postgres up to 9.0 does
not support such scans for GIN indices.
Note that this restriction was removed in postgres 9.1 which
is currently in beta. However, GIT indices must be re-created
with REINDEX after upgrading from 9.0 to leverage that
improvement.
best regards.
Florian Pflug
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2011-06-12 14:12:36 | Re: Creating new remote branch in git? |
Previous Message | Florian Pflug | 2011-06-12 11:53:08 | Re: Range Types and extensions |