Re: search on accents -> Why not include this function

From:	Patrice Hédé <patrice(at)islande(dot)org>
To:	Jaume Teixi <teixi(at)6tems(dot)com>
Cc:	Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-admin(at)postgresql(dot)org
Subject:	Re: search on accents -> Why not include this function
Date:	2001-03-29 21:00:47
Message-ID:	20010329230047.H6360@idf.net
Lists:	pgsql-admin

Hi,

First, thank you for including me in this thread: I haven't
been involved with PostgreSQL for 3 years now, and it's nice to see
that this hack is still useful to some people! (I should, however,
soon get involved with databases again :) ).

About this programme, I agree with Peter that it is too biased to be
included as a standard function. It is biased towards ISO-8859-1, and
towards some European languages I know ("d" or "dh" => "ð" is for
Icelandic, for example)... although "a" => "å" makes sense: not all
people working with Swedish/Norwegian/Danish have a Scandinavian
keyboard, and they may not be sure whether the programme will do the
"aa" => "å" translation correctly (which this function does ;) ).

Back to the subject, though. This function also has another
limitation: it uses a fixed-length buffer of 4096 bytes, which is
not so nice (although it does guard against buffer overflows...).
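
To illustrate the trade-off described above, here is a minimal sketch in C of
a fixed-buffer accent fold. It is not the original source: the names
FOLD_BUF_SIZE, fold_latin1_byte and fold_accents are hypothetical, the input
is assumed to be ISO-8859-1, and the mapping table is deliberately tiny.

```c
#include <stddef.h>

#define FOLD_BUF_SIZE 4096  /* fixed size, matching the limit mentioned above */

/* Hypothetical helper: map one ISO-8859-1 byte to an unaccented letter.
 * Only a few entries are shown; a real table would be language-specific. */
static char fold_latin1_byte(unsigned char c)
{
    switch (c) {
    case 0xE0: case 0xE1: case 0xE2: case 0xE4: case 0xE5:
        return 'a';             /* à á â ä å */
    case 0xE8: case 0xE9: case 0xEA: case 0xEB:
        return 'e';             /* è é ê ë */
    case 0xF0:
        return 'd';             /* ð */
    default:
        return (char)c;         /* everything else passes through unchanged */
    }
}

/* Fold `in` into `out` (capacity `out_size`), always NUL-terminating.
 * Returns the number of bytes written, excluding the terminator.
 * Input that does not fit is truncated instead of overflowing. */
static size_t fold_accents(const char *in, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0)
        return 0;
    while (in[n] != '\0' && n + 1 < out_size) {
        out[n] = fold_latin1_byte((unsigned char)in[n]);
        n++;
    }
    out[n] = '\0';
    return n;
}
```

A caller would declare `char buf[FOLD_BUF_SIZE];` and pass `sizeof buf` as the
capacity. The cost of the fixed size is visible here: anything beyond 4095
bytes is silently dropped, which is safe but may surprise users with long
strings.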

Maybe, if it's not already the case, the source code could be put in a
contribution directory, available for anyone to adapt to his/her
needs without having to go through 3 years of archives, since it seems
to be a fairly common problem. The code should be simple enough for
anyone with a basic knowledge of C to customise :)

I know that localisation, collation, and "acceptable alternatives"
follow quite different rules from country to country, making it
difficult to come up with a general solution. This is why I didn't even
try to make one ;)

Patrice

* Jaume Teixi <teixi(at)6tems(dot)com> [010329 22:04]:
> But the thing is that you must explicitly call this function in order
> to use it.
> Also, for the sake of aesthetics, maybe you should call it
> accents_iso-8859-1. The thing is that this should be considered a big
> need for non-English languages.
>
> More broadly, it could also be possible to modify it to
> accept parameters such as ('å','à') or ('ca_ES','fr_FR')....
>
> bests,
> jaume.
>
>
> > For the reason I cited above: it is too abstract an approach for
> > many languages and/or applications. For example in Swedish, a
> > search for 'e' should probably include 'é', since most users will
> > not type that in explicitly (it's not on the keyboard), but a
> > search for 'a' should normally not include 'å', since that is a
> > completely separate letter (and it is on the keyboard).
> > Additionally, this particular implementation seems to be
> > ISO-8859-1 charset specific. I know a number of accented
> > letters that are a lot closer "siblings" to 'd' than 'ð' is.
> >
> > --
> > Peter Eisentraut peter_e(at)gmx(dot)net http://yi.org/peter-e/
>

--
Patrice HÉDÉ --------------------------------- patrice(at)islande(dot)org -----
-- Isn't it weird how scientists can imagine all the matter of the
universe exploding out of a dot smaller than the head of a pin, but they
can't come up with a more evocative name for it than "The Big Bang" ?
-- What would _you_ call the creation of the universe ?
-- "The HORRENDOUS SPACE KABLOOIE !" - Calvin and Hobbes
------------------------------------------ http://www.islande.org/ -----

In response to

Re: search on accents -> Why not include this function at 2001-03-29 18:18:11 from Jaume Teixi
