BUG #18580: The pg_similarity appears to be wrong

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: bosamia(dot)karan(at)gmail(dot)com
Subject: BUG #18580: The pg_similarity appears to be wrong
Date: 2024-08-12 09:58:07
Message-ID: 18580-11fe381a9adde140@postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 18580
Logged by: Karan Bosamia
Email address: bosamia(dot)karan(at)gmail(dot)com
PostgreSQL version: 14.10
Operating system: Ubuntu
Description:

SELECT *
FROM (
    SELECT
        *,
        similarity(provision_clean_description, 'Policies The General Partner shall promptly notify the Investor of any proposed changes in the Funds leverage policies including adjustments to leverage ratios') AS sim_tim
    FROM provision_database
) pd
WHERE sim_tim <= 1 AND sim_tim > 0.7 AND firm_id = 18;

Both of the following sentences are returned with a similarity score of 1,
even though sentence 1 has "Policies" as its starting word (the leading
hyphens are not part of the sentences):
- Policies The General Partner shall promptly notify the Investor of any
proposed changes in the Funds leverage policies including adjustments to
leverage ratios
- The General Partner shall promptly notify the Investor of any proposed
changes in the Funds leverage policies including adjustments to leverage
ratios
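
The score can be checked outside the database with a rough sketch of how pg_trgm is documented to compare strings: it lowercases the text, splits it into words, pads each word with two leading spaces and one trailing space, and compares the sets of unique three-character sequences. The Python below is only an approximation under those assumptions; the helper names trigrams and similarity are hypothetical stand-ins, and the real extension may treat non-ASCII characters and word boundaries differently. Because "policies" already occurs later in the sentence, the extra leading word adds no new trigrams in this model.

```python
import re


def trigrams(text: str) -> set:
    """Approximate pg_trgm trigram extraction (assumed rules, not the real code)."""
    grams = set()
    # Lowercase, then treat every run of letters/digits as one word.
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        # Assumed padding: two spaces before the word, one space after.
        padded = "  " + word + " "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared unique trigrams divided by all unique trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


without_prefix = (
    "The General Partner shall promptly notify the Investor of any proposed "
    "changes in the Funds leverage policies including adjustments to "
    "leverage ratios"
)
with_prefix = "Policies " + without_prefix

# Both sentences contain the same set of distinct words, hence the same trigrams.
print(similarity(with_prefix, without_prefix))  # prints 1.0
```

In this model, repeating a word that is already present does not change the trigram set, while prepending a word that is not in the sentence would lower the score below 1.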
