Re: String Similarity

From:	"Mark Woodward" <pgsql(at)mohawksoft(dot)com>
To:	"Mark Dilger" <pgsql(at)markdilger(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: String Similarity
Date:	2006-05-19 21:10:23
Message-ID:	18219.24.91.171.78.1148073023.squirrel@mail.mohawksoft.com
Lists:	pgsql-hackers

> Mark Woodward wrote:
>> I have a side project that needs to "intelligently" know if two strings
>> are contextually similar. Think about how CDDB information is collected
>> and sorted. It isn't perfect, but there should be enough information to
>> be usable.
>>
>> Think about this:
>>
>> "pink floyd - dark side of the moon - money"
>> "dark side of the moon - pink floyd - money"
>> "money - dark side of the moon - pink floyd"
>> etc.
>>
>> To a human, these strings are almost identical. Similarly:
>>
>> "dark floyd of money moon pink side the"
>>
>> Is a puzzle to be solved by 13 year old children before the movie
>> starts.
[snip]
>
> Hmmm... I think I like this problem. Maybe I'll work on it a bit as a
> contrib module.

I *have* a working function, but it is not very efficient, and it is not
what I would call numerically predictable. It does, however, find the
various substrings common to the two strings in question.
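
As a rough illustration of the kind of order-insensitive comparison being
discussed (this is not the function mentioned above, which is not shown in
this message), one simple baseline is Jaccard similarity over the sets of
words in each string. The names `tokens` and `token_jaccard` and the
tokenization rule are assumptions made for this sketch.

```python
import re


def tokens(s):
    """Lowercase the string and return its alphanumeric words as a set."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def token_jaccard(a, b):
    """Return |A & B| / |A | B| over the word sets of a and b (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        # Assumption: two strings with no words are treated as identical.
        return 1.0
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    a = "pink floyd - dark side of the moon - money"
    b = "money - dark side of the moon - pink floyd"
    c = "dark floyd of money moon pink side the"
    print(token_jaccard(a, b))
    print(token_jaccard(a, c))
```

Because this measure ignores word order entirely, it cannot tell the
reordered titles apart from the scrambled one, and it does not locate
common substrings; it is only a starting point for comparison.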

Email me offline and we can make something for contrib.

In response to

Re: String Similarity at 2006-05-19 20:52:53 from Mark Dilger
