Re: Query taking long with levenshtein function

From: "Edward J(dot) Sabol" <edwardjsabol(at)gmail(dot)com>
To: pgsql-admin <pgsql-admin(at)lists(dot)postgresql(dot)org>
Cc: Teju Jakkidi vlogs <teja(dot)jakkidi05(at)gmail(dot)com>
Subject: Re: Query taking long with levenshtein function
Date: 2023-03-01 06:57:11
Message-ID: 2669E894-DC6E-4D33-ABAB-6E7C6D861F1E@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On Feb 28, 2023, at 7:22 PM, Edward J. Sabol <edwardjsabol(at)gmail(dot)com> wrote:





On Feb 28, 2023, at 6:31 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:



Teju Jakkidi vlogs <teja(dot)jakkidi05(at)gmail(dot)com> writes:


We have a query as below which uses the levenshtein function to compare



strings.


SELECT "name", levenshtein("name",'some string') as p FROM table1



where levenshtein("name",'some string')   <= 2 order by p desc;


We have a GIST index built on top of this table as below:



CREATE INDEX gist_idx ON table1 USING GIST("name");

AFAIK, that index is completely useless for this query.  I don't

really see a good way to index it either --- "levenshtein distance

less than X" seems like a not very tractable requirement.  Can

you formulate your matching rules some other way?



Can you use trigram similarity instead? That has index support...

Also, check out this URL which shows how to use soundex(), which you can create a functional index on, with levenshtein(): 


















Fuzzy Name Matching in Postgres


crunchydata.com










At least in that blog's example, soundex() combined with levenshtein() achieved the same query result as using only levenshtein() in one hundredth of the time.

Hope this helps,

Ed

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Govardhan .G 2023-03-01 07:48:20 Forcasting in Postgresql
Previous Message Edward J. Sabol 2023-03-01 00:22:08 Re: Query taking long with levenshtein function