From: | Lincoln Yeoh <lyeoh(at)pop(dot)jaring(dot)my> |
---|---|
To: | Mike Chamberlain <mikeachamberlain(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: Full text search in Chinese |
Date: | 2010-10-26 18:05:47 |
Message-ID: | 20101026180649.B99F41336DB6@mail.postgresql.org |
Lists: | pgsql-general |
At 11:42 AM 10/25/2010, Mike Chamberlain wrote:
>Has anyone implemented FTS in Chinese on PG? I
>guess I need a Chinese ispell dictionary and
>parser, neither of which I can find after a lot of googling.
>
>I have a bounty on this question on Stackoverflow if anyone wants to claim it:
>
>http://stackoverflow.com/questions/3994504/how-do-i-implement-full-text-search-in-chinese-on-postgresql
>
>Thanks,
>
>Mike
What sort of usage are you expecting, e.g. what kind of search terms?
Written Chinese is a character-based language,
not an alphabet-style language. To complicate
things a bit, there are two main character sets:
Traditional Chinese and Simplified Chinese.
A single Chinese character is roughly the equivalent of an
English keyword, but lots of "words"/"meanings"
require two or more characters. You might
be able to handle these much the way English
phrases are handled (indexed and searched for);
after all, "bee's knees" usually means something
different from the actual bee's knees.
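One way to picture the phrase-style handling above is to index every overlapping pair of adjacent characters, so that two-character words become searchable tokens without a dictionary. The sketch below is only an illustration under assumptions: the function names `cjk_bigrams` and `_bigrams` are hypothetical, only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) is treated as Chinese, and nothing here is PostgreSQL's own parser.

```python
def _bigrams(run):
    """Overlapping two-character tokens for one run of characters (hypothetical helper)."""
    if len(run) == 1:
        # A lone character still has to be findable, so keep it as its own token.
        return [run[0]]
    return [run[i] + run[i + 1] for i in range(len(run) - 1)]


def cjk_bigrams(text):
    """Split text into overlapping character pairs, one run of CJK characters at a time.

    Assumption: only U+4E00..U+9FFF counts as a Chinese character; anything
    else (spaces, punctuation, Latin letters) just ends the current run.
    """
    tokens = []
    run = []
    for ch in text:
        if '\u4e00' <= ch <= '\u9fff':
            run.append(ch)
        else:
            tokens.extend(_bigrams(run))
            run = []
    tokens.extend(_bigrams(run))  # flush the final run
    return tokens


print(cjk_bigrams("全文检索"))
```

The same function would be applied to both the stored text and the search terms, so that a query matches when its pairs appear in the document. The cost is a larger index and some false matches across word boundaries.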
Japanese, on the other hand, has _three_ main
scripts: two "alphabet style" (hiragana and katakana) and one "Chinese character style" (kanji)...
Regards,
Link.