From: | Cristiano Coelho <cristianocca(at)hotmail(dot)com> |
---|---|
To: | "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org> |
Subject: | pg_trgm word_similarity inconsistencies or bug |
Date: | 2017-10-27 18:48:08 |
Message-ID: | CY4PR17MB13207ED8310F847CF117EED0D85A0@CY4PR17MB1320.namprd17.prod.outlook.com |
Lists: | pgsql-bugs pgsql-hackers |
Hello all, this concerns PostgreSQL 9.6 (9.6.4); a good description can be found here: https://stackoverflow.com/questions/46966360/postgres-word-similarity-not-comparing-words
In summary, word_similarity doesn’t seem to do exactly what the docs say: it matches trigrams across multiple words rather than doing a word-by-word comparison.
Below is a query with the actual output (word_similarity) and the expected output (my_word_similarity), thanks to kiln from Stack Overflow for providing it.
with data(t) as (
values
('message'),
('message s'),
('message sag'),
('message sag sag'),
('message sag sage')
)
select t, word_similarity('sage', t), my_word_similarity('sage', t)
from data;
        t         | word_similarity | my_word_similarity
------------------+-----------------+--------------------
 message          |             0.6 |                0.3
 message s        |             0.8 |                0.3
 message sag      |               1 |                0.5
 message sag sag  |               1 |                0.5
 message sag sage |               1 |                  1
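
The message does not include the definition of my_word_similarity, so the following is only a sketch of what a word-by-word comparison could look like, written in Python so that it runs without a database. It assumes that the expected value is the highest trigram similarity between the query and any single word of the text, and that trigrams are extracted roughly the way pg_trgm does it (lowercased alphanumeric words, padded with two spaces in front and one behind). The function names are hypothetical and this is not the pg_trgm implementation.

```python
import re

# Assumption: a "word" is a run of ASCII letters or digits, compared case-insensitively.
WORD_RE = re.compile(r"[a-z0-9]+")


def word_trigrams(word):
    """Trigrams of one word, padded with two leading spaces and one trailing space."""
    padded = "  " + word.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def text_trigrams(text):
    """Union of the trigram sets of every word in the text."""
    result = set()
    for word in WORD_RE.findall(text.lower()):
        result |= word_trigrams(word)
    return result


def similarity(a, b):
    """Shared trigrams divided by distinct trigrams in either string."""
    ta, tb = text_trigrams(a), text_trigrams(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def per_word_similarity(query, text):
    """Hypothetical stand-in for my_word_similarity: best match against a single word."""
    words = WORD_RE.findall(text.lower())
    return max((similarity(query, w) for w in words), default=0.0)


if __name__ == "__main__":
    rows = ["message", "message s", "message sag",
            "message sag sag", "message sag sage"]
    for t in rows:
        print(f"{t:<17}| {per_word_similarity('sage', t):.2f}")
```

Under these assumptions the sketch yields 0.3, 0.3, 0.5, 0.5 and 1 for the five rows, which agrees with the my_word_similarity column above: 'sage' shares 3 of 10 distinct trigrams with 'message' and 3 of 6 with 'sag'.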