| From: | Teodor Sigaev <teodor(at)sigaev(dot)ru> |
|---|---|
| To: | Magnus Hagander <magnus(at)hagander(dot)net> |
| Cc: | richardcraig <richard(at)v3fm(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org> |
| Subject: | Re: to_tsvector in 8.2.3 |
| Date: | 2007-03-21 18:13:55 |
| Message-ID: | 460175E3.40601@sigaev.ru |
| Lists: | pgsql-general |
> postgres=# select to_tsvector('test text');
> to_tsvector
> ---------------
> 'test text':1
> (1 row)
OK, that's related to this commit:
http://developer.postgresql.org/cvsweb.cgi/pgsql/contrib/tsearch2/wordparser/parser.c.diff?r1=1.11;r2=1.12;f=h
Thomas pointed out that the separator could be a non-breaking space (0xa0). That
commit assumes that, with the C locale and a multibyte encoding, any character
> 0x7f is alphabetic, so 0xa0 would be parsed as part of the word.
To check this theory, please apply the attached patch.
If that is the cause, I'm confused: we cannot assume that 0xa0 is a space symbol
in any multibyte encoding, even on Windows.
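
To illustrate the theory, here is a minimal sketch in C. It is not the code from
parser.c and not the contents of nonbreak.patch; is_word_byte, count_words and
the high_bit_is_alpha flag are hypothetical names invented for this example. It
only shows how treating every byte > 0x7f as alphabetic would glue two words
separated by 0xa0 into one token.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a "is this byte part of a word?" test.
 * high_bit_is_alpha = 1 mimics the assumption described above:
 * every byte > 0x7f counts as alphabetic. */
static int is_word_byte(unsigned char c, int high_bit_is_alpha)
{
    if (c > 0x7f)
        return high_bit_is_alpha;
    return (c >= 'a' && c <= 'z') ||
           (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9');
}

/* Count maximal runs of word bytes in a NUL-terminated string. */
static size_t count_words(const char *text, int high_bit_is_alpha)
{
    size_t words = 0;
    int in_word = 0;

    assert(text != NULL);
    for (const unsigned char *p = (const unsigned char *) text; *p; p++)
    {
        int w = is_word_byte(*p, high_bit_is_alpha);

        if (w && !in_word)
            words++;            /* a new word starts here */
        in_word = w;
    }
    return words;
}
```

With the flag set, count_words("test\xa0" "text", 1) sees a single word, which
matches the single lexeme in the output quoted above; with the flag cleared it
sees two. Clearing the flag is not a real fix, though: in a multibyte encoding a
byte such as 0xa0 can be part of a longer character, so rejecting it would
split legitimate non-ASCII words.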
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/
| Attachment | Content-Type | Size |
|---|---|---|
| nonbreak.patch | text/plain | 609 bytes |