Re: suitable text search configuration

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: suitable text search configuration
Date: 2007-10-24 22:17:47
Message-ID: 20071024221747.GA4626@alvh.no-ip.org
Lists: pgsql-hackers

Andrew Dunstan wrote:
>
> Tom Lane wrote:

>> Actually, looking at the examples so far, I'm thinking we should just
>> consider the string up to the first _, period.

I studied the standards a bit to see whether they mandate that locale
names take the form "language_COUNTRY", and couldn't find anything,
which makes me think it's mostly a (very well established) convention.
So I think splitting on the _ should not be the first thing we try.

>> An alternative is to try to match the full locale (es_ES) and then try
>> the language (es) if that wasn't found. That would leave room to put
>> country-by-country exceptions in, but for the moment we'd not have any.
>
> Can anyone point to a real world example where country by country would
> make sense? If we need to distinguish flavors of some languages, I would
> not be at all surprised if this was not by country anyway.

pt_BR versus pt_PT. I'm not sure if it makes a difference to a stemmer,
but maybe to a thesaurus it does ...
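
A rough sketch of the lookup order discussed above (full locale first,
then the string up to the first _), in Python rather than the C the
backend would use. The table contents, the name pick_config, the
"portuguese_brazil" entry, and the "simple" fallback are all assumptions
made for illustration, not anything proposed in this thread.

```python
# Hypothetical table: keys are locale or language codes, values are
# text search configuration names. The pt_BR entry is illustrative only.
CONFIG_BY_LOCALE = {
    "es": "spanish",
    "pt": "portuguese",
    "pt_BR": "portuguese_brazil",  # hypothetical country-specific exception
}


def pick_config(locale_name, table=CONFIG_BY_LOCALE, default="simple"):
    """Return a config name for a locale, trying the full name first."""
    # Drop an encoding or modifier suffix such as ".UTF-8" or "@euro".
    base = locale_name.split(".", 1)[0].split("@", 1)[0]
    # First attempt: the full locale name, e.g. "es_ES".
    if base in table:
        return table[base]
    # Second attempt: the string up to the first "_", e.g. "es".
    language = base.split("_", 1)[0]
    # Assumed fallback when neither lookup matches.
    return table.get(language, default)
```

With this table, pick_config("pt_BR.UTF-8") hits the country-specific
entry, while pick_config("pt_PT") and pick_config("es_ES") fall back to
the language-only entries. With no country entries in the table, the
result is the same as just taking the string up to the first _.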

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
