From: | "Tomi NA" <hefest(at)gmail(dot)com> |
---|---|
To: | pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | full text search: the concept of a "word" |
Date: | 2006-04-20 22:20:22 |
Message-ID: | d487eb8e0604201520w3532cdd7saa2596046b5cbe51@mail.gmail.com |
Lists: | pgsql-general |
I'm considering using tsearch2 in the project I'm working on right
now. However, I'm not sure whether tsearch2 can handle my very specific
requirements, so I hope someone can tell me if the following is
possible and how I should go about it.
My text fields are trigger-generated from information in a number of
tables; each field can be, say, a couple of thousand characters
long.
Up to here, there's no problem.
What I'd like to do is define, possibly using regexps, what
constitutes a word. For instance, my word separator is a semicolon,
not a space; a dash is not a separator, and neither are
language-specific characters (which a language-agnostic tool might
otherwise treat as separators).
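To make the rule concrete, here is a minimal Python sketch of the splitting I have in mind. It runs outside the database and is not tsearch2 configuration; the name `split_words` and the trimming of whitespace around each word are assumptions made only for this illustration.

```python
import re
from typing import List

# Only a semicolon separates words. Whitespace directly around the
# semicolon is swallowed (an assumption for this sketch); spaces,
# dashes and non-ASCII letters inside a word are left untouched.
_SEPARATOR = re.compile(r"\s*;\s*")


def split_words(field: str) -> List[str]:
    """Return the non-empty, semicolon-delimited words of `field`."""
    return [word for word in _SEPARATOR.split(field.strip()) if word]


if __name__ == "__main__":
    # Sample data is made up; it mixes a space, a dash and UTF-8 letters.
    for word in split_words("Ivan Horvat; čaj-limun ;šećer;"):
        print(word)
```

With this rule, "Ivan Horvat" and "čaj-limun" each count as a single word, which is the behaviour I would like the full text search to reproduce.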
BTW, I use UTF-8 as my database encoding if it's of any importance.
What it comes down to is this: is it possible to somehow define what
constitutes a word?
TIA,
Tomislav