Re: Fuzzy substring searching with the pg_trgm extension

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Fuzzy substring searching with the pg_trgm extension
Date:	2016-01-29 15:39:51
Message-ID:	20160129153951.GA773484@alvherre.pgsql
Lists:	pgsql-hackers

Teodor Sigaev wrote:
> >The behavior of this function is surprising to me.
> >
> >select substring_similarity('dog' , 'hotdogpound') ;
> >
> > substring_similarity
> >----------------------
> > 0.25
> >
> Substring search was designed to search for a similar word in a string:
> contrib_regression=# select substring_similarity('dog' , 'hot dogpound') ;
> substring_similarity
> ----------------------
> 0.75
>
> contrib_regression=# select substring_similarity('dog' , 'hot dog pound') ;
> substring_similarity
> ----------------------
> 1

Hmm, this behavior looks too much like magic to me. I mean, a substring
is a substring -- why are we treating the space as a special character
here?
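
For readers following the thread, the three results quoted above can be reproduced with a simplified model of trigram matching. This is only a sketch, not the patch's code. It assumes that each word is padded with two leading spaces and one trailing space before trigrams are extracted (as the pg_trgm documentation describes), that non-alphanumeric characters separate words, and that the score is the best set similarity between the query's trigrams and any contiguous run of the target's trigrams. The names `word_trigrams`, `trigram_sequence` and `substring_similarity_sketch` are hypothetical.

```python
import re


def word_trigrams(word):
    # Pad with two leading spaces and one trailing space, then slide a
    # 3-character window over the padded word.
    padded = "  " + word + " "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def trigram_sequence(text):
    # Assumption: lowercase the text and treat any non-alphanumeric
    # character (including the space) as a word separator.
    words = [w for w in re.split(r"[^0-9a-z]+", text.lower()) if w]
    seq = []
    for w in words:
        seq.extend(word_trigrams(w))
    return seq


def substring_similarity_sketch(needle, haystack):
    # Brute force: try every contiguous run of the haystack's trigrams and
    # keep the best |intersection| / |union| against the needle's trigrams.
    a = set(trigram_sequence(needle))
    seq = trigram_sequence(haystack)
    best = 0.0
    for lo in range(len(seq)):
        extent = set()
        for hi in range(lo, len(seq)):
            extent.add(seq[hi])
            best = max(best, len(a & extent) / len(a | extent))
    return best


print(substring_similarity_sketch("dog", "hotdogpound"))    # 0.25
print(substring_similarity_sketch("dog", "hot dogpound"))   # 0.75
print(substring_similarity_sketch("dog", "hot dog pound"))  # 1.0
```

Under these assumptions the space matters because of the padding: 'dog' yields the trigrams "  d", " do", "dog" and "og ", and the padded ones can only match at a word boundary in the target. In 'hotdogpound' only "dog" is shared (1 of 4), in 'hot dogpound' the leading boundary adds "  d" and " do" (3 of 4), and in 'hot dog pound' all four match.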

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Re: Fuzzy substring searching with the pg_trgm extension at 2016-01-29 14:15:18 from Teodor Sigaev

Responses

Re: Fuzzy substring searching with the pg_trgm extension at 2016-01-29 15:58:39 from Artur Zakirov
Re: Fuzzy substring searching with the pg_trgm extension at 2016-02-11 12:30:44 from Teodor Sigaev
