From: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>, PgSQL General ML <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Initial ugly reverse-translator |
Date: | 2008-04-19 17:10:38 |
Message-ID: | Pine.LNX.4.64.0804192110060.21547@sn.sai.msu.ru |
Lists: | pgsql-general |
On Sat, 19 Apr 2008, Tom Lane wrote:
> Craig Ringer <craig(at)postnewspapers(dot)com(dot)au> writes:
>> Tom Lane wrote:
>>> I don't really see the problem. I assume from your reference to pg_trgm
>>> that you're using trigram similarity as the prefilter for potential
>>> matches
>
>> It turns out that's no good anyway, as it appears to ignore characters
>> outside the ASCII range. Rather less than useful for searching a
>> database of translated strings ;-)
>
> A quick look at the pg_trgm code suggests that it is only prepared to
> deal with single-byte encodings; if you're working in UTF8, which I
> suppose you'd have to be, it's dead in the water :-(. Perhaps fixing
> that should be on the TODO list.
The same goes for ltree. Both are on our TODO list:
http://www.sai.msu.su/~megera/wiki/TODO
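
To illustrate why trigram code written for single-byte encodings cannot simply be pointed at UTF8 data, here is a rough Python sketch. It is not the pg_trgm implementation; the helper names (trigrams, is_valid_utf8) and the sample word are made up for the example. It only shows that 3-byte windows over UTF8 text cut multibyte characters apart, while 3-character windows do not.

```python
def trigrams(seq):
    """Return the set of all length-3 windows over a str or bytes value."""
    return {seq[i:i + 3] for i in range(len(seq) - 2)}


def is_valid_utf8(chunk):
    """True if the byte string decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        chunk.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Sample word chosen for illustration: six Cyrillic letters, two bytes each in UTF-8.
word = "привет"

char_trigrams = trigrams(word)                  # windows over characters
byte_trigrams = trigrams(word.encode("utf-8"))  # windows over raw bytes

# Every 3-byte window here splits a 2-byte character, so none of them decode.
decodable = [t for t in byte_trigrams if is_valid_utf8(t)]

print(len(char_trigrams), len(byte_trigrams), len(decodable))  # prints: 4 10 0
```

A multibyte-aware version would need to build its trigrams from characters rather than bytes, which is the kind of change the TODO item implies.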
>
> But in any case maybe the full-text-search stuff would be more useful
> as a prefilter? Although honestly, for the speed we need here, I'm
> not sure a prefilter is needed at all. Full text might be useful
> if a LIKE-based match fails, though.
>
>>> (And besides, speed doesn't seem like the be-all and end-all here.)
>
>> True. It's not so much the speed as the fragility when faced with small
>> changes to formatting. In addition to whitespace, some clients mangle
>> punctuation with features like automatic "curly"-quoting.
>
> Yeah. I was wondering whether encoding differences wouldn't be a huge
> problem in practice, as well.
>
> regards, tom lane
>
>
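
On the fragility point quoted above (whitespace changes and "curly"-quoting), one possible approach is to normalize both the stored string and the incoming message before any LIKE or full-text comparison. The following Python sketch is only an assumption about how that could look; the function name normalize and the particular set of replaced characters are mine, not something from the reverse-translator code.

```python
import re
import unicodedata

# Map typographic punctuation back to plain ASCII (an illustrative, incomplete set).
_PUNCT = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
})


def normalize(msg):
    """Return msg in a canonical form suitable for tolerant comparison."""
    msg = unicodedata.normalize("NFC", msg)   # unify composed/decomposed forms
    msg = msg.translate(_PUNCT)               # undo automatic curly-quoting
    return re.sub(r"\s+", " ", msg).strip()   # collapse runs of whitespace


print(normalize("could not  open \u201cfile\u201d\n"))
```

Applying the same function to both sides of the comparison means a client that reflows whitespace or rewrites quotes still produces a string that matches.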
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83