Re: Merge rows based on Levenshtein distance

From:	David G Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Merge rows based on Levenshtein distance
Date:	2014-12-02 00:49:41
Message-ID:	1417481381890-5828847.post@n5.nabble.com
Lists:	pgsql-general

mongoose wrote
> I am new to PostgreSQL and I have the following table:
>
> Name, City
> "Alex", "Washington"
> "Aleex1", "Washington"
> "Bob", "NYC"
> "Booob", "NYC"
>
> I want to "merge" similar rows based on levenshtein distance between names
> so that I have the following table:
>
> id, Name, City
> 1,"Alex", "Washington"
> 1,"Aleex1", "Washington"
> 2,"Bob", "NYC"
> 2,"Booob", "NYC"
>
> How could I do that on PostgreSQL? Is there an SQL command for this?
> Thanks

So you have a table of N names and you want to evaluate N*(N-1) ordered
pairs (every name against every other name) and then use the output of
the levenshtein calculation to group them together.

SELECT
  l_names.name_value,
  r_names.name_value,
  leven[...](l_names.name_value, r_names.name_value) AS pair_group
FROM table_of_names AS l_names
CROSS JOIN table_of_names AS r_names
WHERE l_names.name_value <> r_names.name_value
;

Feel free to add "group by city" or "WHERE substring(l_names.name_value, 1,
1) = substring(r_names.name_value, 1, 1)" (string positions in PostgreSQL
start at 1, so a start of 0 with length 1 returns an empty string), since it
seems you need more than just a name distance to generate the desired
groups. You'd likely want to add the same "substring" call to the
SELECT list and the "GROUP BY" clause...
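
To make the grouping step concrete, here is a minimal sketch of the same idea outside the database, written in Python. It is an illustration under stated assumptions, not the query above: the function names (levenshtein, assign_group_ids), the distance threshold of 2, and the rule that only rows sharing a city may be merged are all choices made for this example. Names that are close enough are linked pairwise, and each connected cluster gets one id.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes, substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def assign_group_ids(rows, max_distance=2):
    """Return (id, name, city) tuples; rows is a list of (name, city).

    Assumption for this sketch: two rows belong together when they share a
    city and their names are within max_distance edits. Links are transitive
    (union-find), so A~B and B~C puts A, B and C in one group.
    """
    parent = list(range(len(rows)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(rows)), 2):
        (name_i, city_i), (name_j, city_j) = rows[i], rows[j]
        if city_i == city_j and levenshtein(name_i, name_j) <= max_distance:
            parent[find(j)] = find(i)

    ids = {}  # cluster root -> sequential group id, in order of appearance
    result = []
    for i, (name, city) in enumerate(rows):
        group_id = ids.setdefault(find(i), len(ids) + 1)
        result.append((group_id, name, city))
    return result


rows = [("Alex", "Washington"), ("Aleex1", "Washington"),
        ("Bob", "NYC"), ("Booob", "NYC")]
for row in assign_group_ids(rows):
    print(row)
```

With the sample data from the question, both "Alex"/"Aleex1" and "Bob"/"Booob" are two edits apart, so a threshold of 2 reproduces the desired ids. A different threshold, or dropping the city condition, would change the groups, which is why the distance alone is unlikely to be enough.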

David J.

--
View this message in context: http://postgresql.nabble.com/Merge-rows-based-on-Levenshtein-distance-tp5828841p5828847.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

In response to

Merge rows based on Levenshtein distance at 2014-12-01 23:48:41 from mongoose

Responses

Re: Merge rows based on Levenshtein distance at 2014-12-03 05:05:53 from mongoose
