Quick Links

Re: Accent insensitive search

From:	"Albe Laurenz" <all(at)adv(dot)magwien(dot)gv(dot)at>
To:	"PFC EXTERN" <lists(at)peufeu(dot)com>, Diego Manilla Suárez <diego(dot)manilla(at)xeridia(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Accent insensitive search
Date:	2007-06-21 13:18:50
Message-ID:	AFCCBB403D7E7A4581E48F20AF3E5DB20379F076@EXADV1.host.magwien.gv.at
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

PFC wrote:
>> Hi. I have a few databases created with UNICODE encoding, and I would
>> like to be able to search with accent insensitivity. There's something
>> in Oracle (NLS_COMP, NLS_SORT) and SQL Server (don't remember) to do
>> this, but I found nothing in PostgreSQL, just the 'to_ascii' function,
>> which AFAIK, doesn't work with UNICODE.
>
> The easiest way is to create an extra column which will hold a copy of
> your text, with all accents removed. You can also convert it to lowercase
> and remove apostrophes, punctuation etc. Said column is kept up to date
> with a trigger.

Creating an extra column which holds a copy of the data is certainly not
a very good idea, as it will create unnecessary redundancy (breaking
normal forms), bloat your table, and the trigger will have a certain
performance impact.

My suggestion is to write a function that removes the accents in a string
for your language, let's call it noaccents(text).

Then you can index the column atext with

CREATE INDEX atable_atext_idx ON atable ((noaccents(atext)))

Then every query of the form

... WHERE noaccents(atext) = noaccents('SOMÉTHING')

can use the index.

Yours,
Laurenz Albe

In response to

Re: Accent insensitive search at 2007-06-21 09:56:44 from PFC

Browse pgsql-general by date

	From	Date	Subject
Next Message	Richard Huxton	2007-06-21 13:22:10	Re: Aggregates
Previous Message	Gregory Stark	2007-06-21 13:16:55	Re: 8.2.3 PANIC with "corrupted item pointer"