BUG #18149: Incorrect lexeme for english token "proxy"

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: pperalta(at)gmail(dot)com
Subject: BUG #18149: Incorrect lexeme for english token "proxy"
Date: 2023-10-05 21:44:27
Message-ID: 18149-936d14e6fc76ca61@postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 18149
Logged by: Patrick Peralta
Email address: pperalta(at)gmail(dot)com
PostgreSQL version: 14.5
Operating system: Linux
Description:

The english dictionary is using the lexeme "proxi" for the token "proxy". As
a result, the search term "proxy" is not yielding results for records that
contain this word.

# select * from ts_debug('english', 'proxy');
   alias   |   description   | token |  dictionaries  |  dictionary  | lexemes
-----------+-----------------+-------+----------------+--------------+---------
 asciiword | Word, all ASCII | proxy | {english_stem} | english_stem | {proxi}

I think this lexeme was chosen to support the plural of proxy, which is
proxies. However, there are other plurals where the root word is spelled
differently, and Postgres creates the correct lexeme for those, such as:

# select * from ts_debug('english', 'goose');
   alias   |   description   | token |  dictionaries  |  dictionary  | lexemes
-----------+-----------------+-------+----------------+--------------+---------
 asciiword | Word, all ASCII | goose | {english_stem} | english_stem | {goos}

# select * from ts_debug('english', 'mouse');
   alias   |   description   | token |  dictionaries  |  dictionary  | lexemes
-----------+-----------------+-------+----------------+--------------+---------
 asciiword | Word, all ASCII | mouse | {english_stem} | english_stem | {mous}
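
A quick way to look at the singular and plural forms side by side is to call
the stemming dictionary directly with ts_lexize. The queries below are only a
sketch for investigation: the word list is illustrative, and the second query
assumes that both the document and the search term are processed with the
same 'english' configuration, which may not match how the affected
application builds its queries.

```sql
-- Show the lexeme english_stem produces for each singular/plural pair.
SELECT word, ts_lexize('english_stem', word) AS lexemes
FROM (VALUES ('proxy'), ('proxies'),
             ('goose'), ('geese'),
             ('mouse'), ('mice')) AS t(word);

-- Check whether a document and a query that are both normalized by the
-- 'english' configuration match each other.
SELECT to_tsvector('english', 'proxy') @@ to_tsquery('english', 'proxy') AS matches;
```

If the second query reports a match, the stemmed lexeme alone would not
explain the missing results, and the way the search term is turned into a
tsquery would be worth checking.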

I believe we can create our own dictionary as a workaround
(https://www.postgresql.org/docs/current/textsearch-dictionaries.html) but
I'm reporting this to see if using "proxi" for "proxy" is intentional.
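
For reference, the workaround could look roughly like the following sketch,
which puts a synonym dictionary in front of english_stem. The names
proxy_syn and english_custom are hypothetical, and the sketch assumes a file
proxy_syn.syn has been placed in $SHAREDIR/tsearch_data; this has not been
tried against the affected data.

```sql
-- Assumed contents of $SHAREDIR/tsearch_data/proxy_syn.syn (hypothetical file):
--   proxy    proxy
--   proxies  proxy

-- Synonym dictionary that maps the listed tokens to a fixed lexeme.
CREATE TEXT SEARCH DICTIONARY proxy_syn (
    TEMPLATE = synonym,
    SYNONYMS = proxy_syn
);

-- Copy the built-in configuration so 'english' itself stays unchanged.
CREATE TEXT SEARCH CONFIGURATION english_custom (COPY = english);

-- Consult the synonym dictionary first; tokens it does not recognize
-- fall through to the regular stemmer.
ALTER TEXT SEARCH CONFIGURATION english_custom
    ALTER MAPPING FOR asciiword
    WITH proxy_syn, english_stem;

-- Inspect the result for the token in question.
SELECT * FROM ts_debug('english_custom', 'proxy');
```

Any stored tsvector columns or indexes built with 'english' would need to be
rebuilt with the new configuration, and queries would have to use it as well.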
