Re: snowball ASCII stemmer configuration

From: Oleg Bartunov <obartunov(at)postgrespro(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: snowball ASCII stemmer configuration
Date: 2020-06-16 14:32:19
Message-ID: CAF4Au4yOx4AG-h--CKwsL7MymaKBYEKgCdxMWJS_QzYZo2Ot+A@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jun 16, 2020 at 4:53 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> writes:
> > There are two cases where these two columns are not the same:
>
> > hindi english \
> > russian english \
>
> > The second one is old; the first one I added using the second one as
> > example. But I wonder what the rationale for this is. Maybe for hindi
> > one could make some kind of cultural argument, but for russian this
> > seems entirely arbitrary.
>
> Perhaps it is, but we have actual Russians who think it's a good idea.
> I recall questioning that point some years ago, and Oleg replied that
> they'd done that intentionally because (a) technical Russian uses a lot
> of English words, and (b) it's easy to tell which is which thanks to
> the disjoint letter sets.
Yes, you are right.
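
To illustrate point (b), here is a minimal Python sketch of the routing idea. It is not PostgreSQL code: the function names are hypothetical, and the assumption is that a word made up only of ASCII characters goes to the second-column stemmer while anything else goes to the language's own stemmer. Because Cyrillic and Latin letters do not overlap, a per-word check is enough.

```python
def is_ascii_word(word: str) -> bool:
    """Return True when every character of the word is in the ASCII range."""
    return word.isascii()


def pick_stemmer(word: str, language: str, ascii_language: str) -> str:
    """Return the name of the stemmer a word would be routed to.

    Hypothetical helper: `language` stands for the first column of the
    list quoted above and `ascii_language` for the second column.
    """
    return ascii_language if is_ascii_word(word) else language


# Mixed technical Russian text: Cyrillic words are routed to "russian",
# Latin-script words to "english".
for w in ["индексы", "PostgreSQL", "запросов", "indexes"]:
    print(w, "->", pick_stemmer(w, "russian", "english"))
```

With the pair "russian english", the Cyrillic words are routed to the Russian stemmer and the Latin-script ones to the English stemmer, which is the behaviour we wanted for technical texts.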

> Whether the same is true for Hindi, I have no idea.
>
> > Moreover, AFAIK, the following other languages do not use Latin-based
> > alphabets:
>
> > arabic arabic \
> > greek greek \
> > nepali nepali \
> > tamil tamil \
>
> Hmm. I think all of those entries are ones that got added by me while
> absorbing post-2007 Snowball updates, and I confess that I did not think
> about this point. Maybe these should be changed.
>
> regards, tom lane

--
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
