From: | Szymon Guz <mabewlun(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Janek Sendrowski <janek12(at)web(dot)de>, PostgreSQL <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Levenshtein Distance with more than 255 characters |
Date: | 2013-09-06 07:30:36 |
Message-ID: | CAFjNrYv8xPYSDPEPujO_9pQoegE5VJjDyroCtfdbwDrdOSxoCA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 6 September 2013 08:47, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Szymon Guz <mabewlun(at)gmail(dot)com> writes:
> > On 6 September 2013 01:00, Janek Sendrowski <janek12(at)web(dot)de> wrote:
> >> I'm searching for an optimized Levenshtein Distance like Postgresql's.
> My
> >> problem is that I want to compare strings with a length over 255
> characters.
> >> Does anyone know a solution?
>
> > I'm not sure there is anything different from what you've found in
> > core/contribs. But you can always use pg/plpython or pg/plperl procedure
> > with some external library calculating the distance.
>
> Well, you could just rebuild the fuzzystrmatch module with a different
> value for MAX_LEVENSHTEIN_STRLEN. The comments in the code note that the
> comparison cost is roughly O(N^2) in the string length, and the reason for
> having a limit at all is to ensure the function runtime doesn't get out of
> hand --- but it seems likely to me that 255 is an unnecessarily
> conservative limit. If you wanted to do a few tests and report back on
> just how slow it can get, we might be persuaded to raise the stock
> setting.
>
> regards, tom lane
>
I've checked that and I think we could raise the limit without any problem
I've set
#define MAX_LEVENSHTEIN_STRLEN 255 * 255
I was using the levenshtein function comparing two strings of the same
length
Strings length: 640
Difference: 531
Time: 2.5ms
Strings length: 1920
Difference: 1258
Time: 20ms
Strings length: 5760
Difference: 1811
Time: 146ms
regards,
Szymon
From | Date | Subject | |
---|---|---|---|
Next Message | Boszormenyi Zoltan | 2013-09-06 07:57:34 | Re: Is this a bug in ECPG? |
Previous Message | Albe Laurenz | 2013-09-06 07:04:40 | Re: How to check if any WAL file is missing in archive folder |