From: | Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp> |
---|---|
To: | teodor(at)sigaev(dot)ru |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: string_to_array eats too much memory? |
Date: | 2006-11-08 15:11:34 |
Message-ID: | 20061109.001134.119869740.t-ishii@sraoss.co.jp |
Lists: | pgsql-hackers |
> > The problem with Japanese is that it's an agglutinative language and we need
> > to separate each word from a sentence. So, I need to modify tsearch2
> > anyway (I know someone from Japan is working on this).
> https://www.oss.ecl.ntt.co.jp/tsearch2j/index.html
> That's it?
Yes. However, I'm going to use a different "word separation" library from
theirs and will make some tweaks.
> > BTW, can tsearch2 handle ~70k words in a document?
>
> I don't see any problem.
Great. I ran a small trial, and tsearch2 seems to work well
with GIN.
> tsvector size should not be greater than 1Mb however.
Is this documented somewhere? Also, I noticed that tsearch2 treats ":"
as a special character. Are there any other special characters? If so,
where are they documented?
--
Tatsuo Ishii
SRA OSS, Inc. Japan