Re: tsearch2 and hyphenated terms

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Reece Hart <reece(at)harts(dot)net>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: tsearch2 and hyphenated terms
Date: 2008-04-11 16:45:32
Message-ID: 13306.1207932332@sss.pgh.pa.us
Lists: pgsql-general

Reece Hart <reece(at)harts(dot)net> writes:
> For the purposes of indexing these names, I suspect I'd get the majority
> of cases by removing a hyphen when it's followed by 1 or 2 chars from
> [a-zA-Z0-9]. Does that require a custom parser?

Yeah, looks like it:

regression=# select * from ts_debug('MCL1 MCL-1');
   alias   |       description        | token |  dictionaries  |  dictionary  | lexemes
-----------+--------------------------+-------+----------------+--------------+---------
 numword   | Word, letters and digits | MCL1  | {simple}       | simple       | {mcl1}
 blank     | Space symbols            |       | {}             |              |
 asciiword | Word, all ASCII          | MCL   | {english_stem} | english_stem | {mcl}
 int       | Signed integer           | -1    | {simple}       | simple       | {-1}
(4 rows)

I had thought you might get a "numhword" output, but that only seems to
happen if there's at least one letter after the dash:

regression=# select * from ts_debug('MCL1 MCL-X1');
      alias      |               description                | token  |  dictionaries  |  dictionary  | lexemes
-----------------+------------------------------------------+--------+----------------+--------------+----------
 numword         | Word, letters and digits                 | MCL1   | {simple}       | simple       | {mcl1}
 blank           | Space symbols                            |        | {}             |              |
 numhword        | Hyphenated word, letters and digits      | MCL-X1 | {simple}       | simple       | {mcl-x1}
 hword_asciipart | Hyphenated word part, all ASCII          | MCL    | {english_stem} | english_stem | {mcl}
 blank           | Space symbols                            | -      | {}             |              |
 hword_numpart   | Hyphenated word part, letters and digits | X1     | {simple}       | simple       | {x1}
(6 rows)
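
Short of writing a parser, the rule quoted above (drop a hyphen when it is followed by only 1 or 2 characters from [a-zA-Z0-9]) could in principle be applied to the text on the client side, before it is handed to the text search parser. The following is only a rough sketch of that idea in Python. The helper name collapse_short_hyphens is hypothetical and not part of PostgreSQL or tsearch2, and requiring a letter or digit before the hyphen is an added assumption, made so that a standalone signed integer such as " -1" is left alone.

```python
import re

# Hypothetical helper, not part of PostgreSQL or tsearch2.
# Matches a hyphen that
#   - is preceded by a letter or digit (assumption: leave a bare "-1" alone), and
#   - is followed by exactly 1 or 2 characters from [A-Za-z0-9], after which
#     the run of letters/digits ends.
_SHORT_SUFFIX_HYPHEN = re.compile(
    r"(?<=[A-Za-z0-9])-(?=[A-Za-z0-9]{1,2}(?![A-Za-z0-9]))"
)


def collapse_short_hyphens(text: str) -> str:
    """Remove hyphens that are followed by only 1 or 2 alphanumerics."""
    return _SHORT_SUFFIX_HYPHEN.sub("", text)


if __name__ == "__main__":
    for name in ("MCL1 MCL-1", "MCL-X1", "well-known"):
        print(repr(name), "->", repr(collapse_short_hyphens(name)))
```

With this preprocessing, 'MCL-1' and 'MCL1' would reach the parser as the same string, while a longer hyphenated term such as 'well-known' is passed through unchanged. The same transformation would have to be applied to query strings as well, or the indexed and searched forms will not match.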

regards, tom lane
