Similarity Search with Wildcards

From: Ghislain Hachey <ghachey(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Similarity Search with Wildcards
Date: 2013-02-28 06:35:37
Message-ID: 512EFAB9.80908@gmail.com
Lists: pgsql-general

Hi list,

I have a varchar column with content such as "Client Name - Brief
Description of Problem" (it's a help desk ticket system). I want to
generate reports by client, and this column is the only thing I can base
my query on. The client names often contain typos or are entered
slightly differently. I installed the pg_trgm extension and it almost
does what I want. The problem is that it measures the similarity of the
whole field and not just the client name, so the matches it returns are
not very similar (I include my query below).

SELECT
    tickets.id AS ticket_id,
    tickets.subject AS ticket_subject,
    similarity(tickets.subject, 'Client Name') AS sml
FROM
    tickets
WHERE
    tickets.subject % 'Client Name';

I thought about using wildcards as discussed here
<http://www.postgresql.org/message-id/flat/4D3CC2DC(dot)6060002(at)wulczer(dot)org#4D3CC2DC(dot)6060002@wulczer.org>
but this does not seem to have any effect (I include the query I tried
below).

SELECT
    tickets.id AS ticket_id,
    tickets.subject AS ticket_subject,
    similarity(tickets.subject, '%Client Name%') AS sml
FROM
    tickets
WHERE
    tickets.subject % '%Client Name%';
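
A quick way to see why the wildcards make no difference, assuming the
documented pg_trgm behaviour of ignoring non-alphanumeric characters when
it extracts trigrams, is to compare the trigram sets of the two search
strings with show_trgm. This is only a sketch and needs the pg_trgm
extension installed:

```sql
-- show_trgm() ships with pg_trgm and returns the trigrams of a string.
-- If '%' is dropped during extraction, both calls yield the same array
-- and the last column should come out true.
SELECT
    show_trgm('Client Name') AS plain_trgm,
    show_trgm('%Client Name%') AS wildcard_trgm,
    show_trgm('Client Name') = show_trgm('%Client Name%') AS same_trigrams;
```

If the two arrays are equal, similarity() and the % operator cannot tell
the two search strings apart.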

Both queries result in the same similarity. I would hope that the
similarity algorithm would only work on the "Client Name" part of the
string and ignore what is before and after; in other words, the latter
query above would return a similarity factor of 1 on the content "Client
Name - Brief Description of Problem".
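
A minimal sketch of one possible way to get that behaviour, assuming
every subject really follows the "Client Name - Description" pattern and
that ' - ' never appears inside a client name (both are assumptions, not
something checked against the real data): cut the client name out with
split_part and compare only that part.

```sql
SELECT
    tickets.id AS ticket_id,
    tickets.subject AS ticket_subject,
    -- split_part(..., ' - ', 1) keeps the text before the first ' - '
    similarity(split_part(tickets.subject, ' - ', 1), 'Client Name') AS sml
FROM
    tickets
WHERE
    -- apply the pg_trgm similarity operator to the client part only
    split_part(tickets.subject, ' - ', 1) % 'Client Name'
ORDER BY
    sml DESC;
```

Subjects that do not contain the separator are compared whole, because
split_part then returns the entire string.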

Any pointers in the right direction would be appreciated.

--
GH<www.ghachey.info>
