Re: Fuzzy substring searching with the pg_trgm extension

From:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
To:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Fuzzy substring searching with the pg_trgm extension
Date:	2016-01-29 14:15:18
Message-ID:	56AB73F6.7050200@sigaev.ru
Lists:	pgsql-hackers

> The behavior of this function is surprising to me.
>
> select substring_similarity('dog' , 'hotdogpound') ;
>
> substring_similarity
> ----------------------
> 0.25
>
Substring search was designed to find a similar word in a string:
contrib_regression=# select substring_similarity('dog' , 'hot dogpound') ;
substring_similarity
----------------------
0.75

contrib_regression=# select substring_similarity('dog' , 'hot dog pound') ;
substring_similarity
----------------------
1
It seems to me that users search for words in a long string. But I agree that
a more detailed explanation is needed and, maybe, we need to change the feature
name to fuzzywordsearch or something else, though I can't think of a good name.
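
The three results above can be reproduced with a small brute-force model. This is only a sketch under stated assumptions, not the pg_trgm C code: it assumes each word is padded with two leading spaces and one trailing space before trigrams are taken, that only ASCII letters and digits form words, and that the score is the best ratio of shared trigrams to the union of trigrams over any contiguous run of trigrams in the second string. The names trigrams and substring_similarity_model are made up for illustration.

```python
import re


def trigrams(text):
    """Return the ordered trigrams of text, word by word.

    Assumption: each word is padded with two leading spaces and one
    trailing space, and only ASCII letters and digits count as word
    characters.
    """
    out = []
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        out.extend(padded[i:i + 3] for i in range(len(padded) - 2))
    return out


def substring_similarity_model(query, text):
    """Brute-force model: best similarity between the query's trigram set
    and any contiguous extent of the text's ordered trigrams."""
    q = set(trigrams(query))
    seq = trigrams(text)
    best = 0.0
    for start in range(len(seq)):
        extent = set()
        for end in range(start, len(seq)):
            extent.add(seq[end])
            shared = len(q & extent)
            # shared / union, where union = |q| + |extent| - shared
            best = max(best, shared / (len(q) + len(extent) - shared))
    return best
```

Under this model 'dog' has four trigrams ("  d", " do", "dog", "og "). In 'hotdogpound' only "dog" is shared, giving 1/4. In 'hot dogpound' the word boundary adds "  d" and " do", giving 3/4. In 'hot dog pound' all four are present, giving 1.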

>
> Also, should we have a function which indicates the position in the
> 2nd string at which the most similar match to the 1st argument occurs?
>
> select substring_similarity_pos('dog' , 'hotdogpound') ;
>
> answering: 4
Interesting. I think it will be useful in some cases.
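
As a rough illustration of what such a function could return, here is a hypothetical sketch. substring_similarity_pos does not exist in pg_trgm; the model below assumes the same padding and scoring as a plain trigram-extent search (two leading spaces and one trailing space per word, shared trigrams over union), assumes ASCII input, and reports the 1-based character position where the best-scoring extent begins, mapping trigrams that start in the padding to the start of their word. All names are made up for illustration.

```python
import re


def positioned_trigrams(text):
    """Return (trigram, position) pairs in order, where position is the
    1-based index in text of the trigram's first non-padding character.

    Assumption: ASCII input, words padded with two leading spaces and
    one trailing space.
    """
    out = []
    for m in re.finditer(r"[a-z0-9]+", text.lower()):
        padded = "  " + m.group() + " "
        for i in range(len(padded) - 2):
            # The first two trigrams begin in the padding, so they are
            # mapped to the first character of the word.
            offset = max(i - 2, 0)
            out.append((padded[i:i + 3], m.start() + offset + 1))
    return out


def substring_similarity_pos_model(query, text):
    """Hypothetical: 1-based position where the best-matching trigram
    extent starts, or 0 if nothing matches."""
    q = {t for t, _ in positioned_trigrams(query)}
    seq = positioned_trigrams(text)
    best, best_pos = 0.0, 0
    for start in range(len(seq)):
        extent = set()
        for end in range(start, len(seq)):
            extent.add(seq[end][0])
            shared = len(q & extent)
            score = shared / (len(q) + len(extent) - shared)
            if score > best:  # keep the earliest extent on ties
                best, best_pos = score, seq[start][1]
    return best_pos
```

With these assumptions, substring_similarity_pos_model('dog', 'hotdogpound') returns 4, matching the answer suggested in the quoted example.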

>
> We could call them <<-> and <->> , where the first corresponds to <%
> and the second to %>
Agreed.
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/
