Re: Fuzzy string matching of product names

From: Bill Moran <wmoran(at)potentialtech(dot)com>
To: Peter Geoghegan <peter(dot)geoghegan86(at)gmail(dot)com>
Cc: PGSQL Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Fuzzy string matching of product names
Date: 2010-04-05 19:14:38
Message-ID: 20100405151438.127b8670.wmoran@potentialtech.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

In response to Peter Geoghegan <peter(dot)geoghegan86(at)gmail(dot)com>:

> Hello,
>
> At the moment, users of my application, which runs on 8.4.3, may
> search for products in a way that is implemented roughly like this:
>
> SELECT * FROM products WHERE description ILIKE '%%usr_string%%';
>
> This works reasonably well. However, I thought it would be a nice
> touch to give my users leeway to spell product names incorrectly when
> searching, or to not have to remember if a product is entered as "coca
> cola", "CocaCola" or "Coca-cola". At the moment, they don't have to
> worry about case sensitivity because I use ILIKE - I'd like to
> preserve that. I'd also like to not have it weigh against them heavily
> when they don't search for a specific product, but just a common
> substring. For example, if they search for "coca-cola", there may be a
> number of different coca-cola products: "CocaCola 330ml can",
> "Coca-Cola 2 litre bottle", but no actual plain "cocacola". That ought
> to not matter too much - all cocacola products should be returned.
>
> This isn't important enough for me to be willing to add a big
> dependency to my application. I'd really prefer to limit myself to the
> contrib modules. pg_trgm and fuzzystrmatch look very promising, but
> it's not obvious how I can use either to achieve what I want.
> Postgres's built-in regex support may have a role to play too.
>
> I can live with it not being indexable, because typically there are
> only tens of thousands of products in a production system.
>
> Could someone suggest an approach that is reasonably simple and
> reasonably generic ?

http://www.postgresql.org/docs/8.4/static/fuzzystrmatch.html

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Brian Modra 2010-04-05 19:23:34 Re: Fuzzy string matching of product names
Previous Message Peter Geoghegan 2010-04-05 19:10:08 Fuzzy string matching of product names