From: | "Guillaume Smet" <guillaume(dot)smet(at)gmail(dot)com> |
---|---|
To: | "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: n-gram search function |
Date: | 2007-02-19 10:29:08 |
Message-ID: | 1d4e0c10702190229k36a2e1bbi6398899f810113bb@mail.gmail.com |
Lists: | pgsql-hackers |
On 2/19/07, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> wrote:
> pg_trgm was developed for spelling correction and there is a threshold of
> similarity, which is 0.3 by default. The README explains what it means.
Yes, I read it.
> Similarity could be very low, since you didn't make a separate column and the
> length of the full string is used to normalize similarity.
Yep, that's probably my problem. Ignored records are a bit longer than
the others.
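
To make the normalization effect concrete, here is a rough Python sketch of a trigram similarity in the style of pg_trgm (shared trigrams divided by the total number of distinct trigrams). The word splitting and padding are simplified assumptions, not the exact pg_trgm code, and the location strings are made up for illustration.

```python
import re

def trigrams(text):
    """Return the set of trigrams of each lowercased word in text.

    Assumption: each word is padded with two leading spaces and one
    trailing space, which approximates what pg_trgm does.
    """
    grams = set()
    for word in re.findall(r"\w+", text.lower()):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams

def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

# Hypothetical data: the same word alone and buried in a longer record.
exact = similarity("paris", "paris")
buried = similarity("paris", "hotel de ville de paris ile de france")

print(exact)         # identical strings share every trigram
print(buried < 0.3)  # the longer record falls under the default threshold
```

The query matches one word of the long record perfectly, but the extra trigrams of the other words inflate the denominator, so the record drops below 0.3 and is ignored.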
I tried the tip in README.pg_trgm to generate a table with all the words.
It can do the work in conjunction with tsearch2 and a bit of AJAX to
suggest the full words to the users. The reason why I was not using
tsearch2 is that it's sometimes hard to spell location names
correctly.
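
The idea of a separate words table used for suggestions can be sketched in Python as follows. This is only an illustration under assumptions: `build_word_table` and `suggest` are hypothetical helpers, the records are made up, and the linear scan stands in for what would be an indexed similarity query in the database.

```python
import re

def trigrams(word):
    """Trigrams of one lowercased word, padded roughly the way pg_trgm pads."""
    padded = "  " + word.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both words."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

def build_word_table(records):
    """Collect the distinct words of all records (the 'words table')."""
    words = set()
    for record in records:
        words.update(re.findall(r"\w+", record.lower()))
    return sorted(words)

def suggest(query, words, threshold=0.3, limit=5):
    """Return up to `limit` words at or above the threshold, best first."""
    scored = [(similarity(query, w), w) for w in words]
    hits = [(s, w) for s, w in scored if s >= threshold]
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [w for _, w in hits[:limit]]

# Hypothetical location records and a misspelled user input.
records = ["Hotel de Ville de Paris", "Gare de Lyon Marseille"]
words = build_word_table(records)
suggestions = suggest("marseile", words)
print(suggestions)
```

Because each candidate is a single word, its length no longer drags the score down, and the suggested word can then be handed to the full-text search.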
The only problem is that it is still quite slow on a 50k-row words
table, but I'll run further tests on a decent server this afternoon.
--
Guillaume