Re: Searching for "bare" letters

From: Uwe Schroeder <uwe(at)oss4u(dot)com>
To: pgsql-general(at)postgresql(dot)org
Cc: "Reuven M(dot) Lerner" <reuven(at)lerner(dot)co(dot)il>
Subject: Re: Searching for "bare" letters
Date: 2011-10-02 02:20:08
Message-ID: 201110011920.08784.uwe@oss4u.com
Lists: pgsql-general

> Hi, everyone. I'm working on a project on PostgreSQL 9.0 (soon to be
> upgraded to 9.1, given that we haven't yet launched). The project will
> involve numerous text fields containing English, Spanish, and Portuguese.
> Some of those text fields will be searchable by the user. That's easy
> enough to do; for our purposes, I was planning to use some combination of
> LIKE searches; the database is small enough that this doesn't take very
> much time, and we don't expect the number of searchable records (or
> columns within those records) to be all that large.
>
> The thing is, the people running the site want searches to work on what
> I'm calling (for lack of a better term) "bare" letters. That is, if the
> user searches for "n", then the search should also match Spanish words
> containing "ñ". I'm told by Spanish-speaking members of the team that
> this is how they would expect searches to work.
>
> However, when I just did a quick test using a UTF-8 encoded 9.0 database,
> I found that PostgreSQL didn't see the two characters as identical. (I
> must say, this is the behavior that I would have expected, had the
> Spanish-speaking team member not said anything on the subject.)
>
> So my question is whether I can somehow wrangle PostgreSQL into thinking
> that "n" and "ñ" are the same character for search purposes, or if I need
> to do something else -- use regexps, keep a "naked," searchable version of
> each column alongside the native one, or something else entirely -- to get
> this to work. Any ideas?
> Thanks,
> Reuven

What kind of "client" are the users using? I assume you will have some kind
of user interface. To me, this is a typical job for the user interface. The
number of letters with "equivalents" in different languages is extremely
limited, so a simple matching routine in the user interface should give you a
way to issue the proper query.
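
As a rough sketch of what such a matching routine could look like: the client
expands each "bare" letter of the search term into a bracket expression that
lists its accented variants, and the resulting pattern is used in a regex
match instead of a plain LIKE. The equivalence table, the function name
bare_to_pattern, and the choice of Python are all illustrative assumptions,
not anything taken from the project.

```python
import re

# Hypothetical equivalence table: bare letter -> itself plus accented variants.
# A real table would be reviewed by the Spanish and Portuguese speakers.
EQUIVALENTS = {
    "a": "aáàâãä",
    "c": "cç",
    "e": "eéèêë",
    "i": "iíìîï",
    "n": "nñ",
    "o": "oóòôõö",
    "u": "uúùûü",
}


def bare_to_pattern(term: str) -> str:
    """Turn a user's search term into a regex that also matches accented forms."""
    parts = []
    for ch in term:
        variants = EQUIVALENTS.get(ch.lower())
        if variants:
            # Letter with known equivalents: match any of them.
            parts.append("[" + variants + "]")
        else:
            # Anything else is matched literally; escape regex metacharacters.
            parts.append(re.escape(ch))
    return "".join(parts)
```

For example, bare_to_pattern("ano") yields [aáàâãä][nñ][oóòôõö], which could
be passed as a bound parameter to a case-insensitive regex match such as
WHERE title ~* $1 (the column name is hypothetical). Note that re.escape
follows Python's escaping rules, so the escaped output should be checked
against PostgreSQL's regex syntax before relying on it for arbitrary input.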

Uwe
