Re: Approximate string matching?

From:	"Josh Berkus" <josh(at)agliodbs(dot)com>
To:	"Joshua b(dot) Jore" <josh(at)greentechnologist(dot)org>
Cc:	pgsql-novice(at)postgresql(dot)org
Subject:	Re: Approximate string matching?
Date:	2002-03-20 23:07:58
Message-ID:	web-834815@davinci.ethosmedia.com
Lists:	pgsql-novice

Joshua,

This is *not* a novice question. I'm not sure where else you'd post it
though.

> Ok, the basic question: does anyone have any approximate string matching
> algorithms coded such that PostgreSQL can use them efficiently? I would like
> to handle inserts/deletes. I already have a Perl and LotusScript (that's
> for Domino) implementation but I haven't ever been able to get the Perl
> module to install right with PostgreSQL.

Metaphone, Soundex, and Levenshtein were built for PostgreSQL by Joe
Conway. Find them in the /contrib directory.

> Translations:
> Wu-Manber k-differences: it's an algorithm that measures how many edits
> are required to turn one string into another. k is the number of edits.
> This is also known as the Levenshtein distance. I'm getting this from the
> Perl Algorithm book.

Levenshtein is available in /contrib. It works well for the database
I use it on, though that only has 7000 records, so you'll have to test
it yourself on really large tables.
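
For anyone who wants to see what the distance actually computes, here is a
minimal Python sketch of the usual dynamic-programming edit distance. It is
an illustration only, not the code that ships in /contrib; the function name
and the sample strings are made up for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character inserts, deletes, and substitutions
    needed to turn string a into string b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] is the distance between the first i-1 characters of a
    # and the first j characters of b; row 0 is "insert j characters".
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning i characters of a into "" takes i deletes
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    # Hypothetical sample pairs, chosen only to exercise the function.
    for left, right in [("kitten", "sitting"), ("flaw", "lawn"), ("same", "same")]:
        print(left, right, levenshtein(left, right))
```

Each comparison costs time proportional to the product of the two string
lengths, which is why it pays to test on a large table before relying on it.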

If you're deduplicating, I wrote a sophisticated name-alike function
using Levenshtein and Metaphone in PL/pgSQL and posted it to Roberto
Mello's function library (accessible from TechDocs).

-Josh Berkus
