From: | Panagiotis Mavrogiorgos <pmav99(at)gmail(dot)com> |
---|---|
To: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Feature: Add Greek language fulltext search |
Date: | 2019-07-09 14:18:00 |
Message-ID: | CAAVvtwrnGCoiG5csey14=mrn_jTUEO2R2TzUWR2+TuezA3wR3A@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Jul 4, 2019 at 1:39 PM Peter Eisentraut <
peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
> On 2019-03-25 12:04, Panagiotis Mavrogiorgos wrote:
> > Last November snowball added support for Greek language [1]. Following
> > the instructions [2], I wrote a patch that adds fulltext search for
> > Greek in Postgres. The patch is attached.
>
> I have committed a full sync from the upstream snowball repository,
> which pulled in the new greek stemmer.
>
> Could you please clarify where you got the stopword list from? The
> README says those need to be downloaded separately, but I wasn't able to
> find the download location. It would be good to document this, for
> example in the commit message. I haven't committed the stopword list yet.
>
Thank you Peter,
Here is the repo with the stop-words:
https://github.com/pmav99/greek_stopwords
The list is based on an earlier publication, with modifications by me. All
the relevant info is on GitHub.
Disclaimer 1: The list has not been validated by an expert.
Disclaimer 2: There are other stop-word lists on the internet, but they are
less complete and they also include Ancient Greek words:
https://github.com/Xangis/extra-stopwords/blob/master/greek
https://github.com/stopwords-iso/stopwords-el/tree/master/raw
Furthermore, my testing showed that snowball needs accents (tonous) and
ς (teliko sigma) to be handled in a special way if you want the stemmer to
work with capitalized words too.
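To illustrate the accent and final-sigma issue mentioned above, here is a minimal sketch of one possible normalization step applied before stemming. It is only an assumption about what "special" handling could look like, not what snowball or Postgres actually does; the function name `normalize_greek` is hypothetical.

```python
import unicodedata


def normalize_greek(text: str) -> str:
    """Fold case, drop accents and map final sigma to plain sigma.

    Hypothetical helper for illustration only; not part of snowball
    or Postgres.
    """
    # Lowercase first, so that capitalized and upper-case forms converge.
    lowered = text.lower()
    # Decompose accented letters into base letter + combining mark,
    # then drop the combining marks. Note: this removes the tonos and
    # also any dialytika.
    decomposed = unicodedata.normalize("NFD", lowered)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose and map the word-final sigma to the regular sigma.
    return unicodedata.normalize("NFC", stripped).replace("ς", "σ")


# The accented lowercase form and the unaccented upper-case form
# of the same word end up identical after normalization.
print(normalize_greek("οδός") == normalize_greek("ΟΔΟΣ"))
```

With such a step, a word typed in capitals (usually written without tonos) and its accented lowercase form reach the stemmer as the same string.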
all the best,
Panagiotis