From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Martin Norbäck Olivers <martin(at)norpan(dot)org> |
Cc: | pgsql-sql(at)lists(dot)postgresql(dot)org |
Subject: | Re: question about to_tsvector and to_tsquery |
Date: | 2021-08-24 14:12:51 |
Message-ID: | 334015.1629814371@sss.pgh.pa.us |
Lists: | pgsql-sql |
Martin Norbäck Olivers <martin(at)norpan(dot)org> writes:
> Is there any more information on exactly how to_tsquery and to_tsvector are
> supposed to work?
> select to_tsvector('simple', '1.b') gives '1':1 'b':2
> but
> select to_tsvector('simple', '1.bb') gives '1.bb':1
ts_debug gives a little bit of insight:
postgres=# select * from ts_debug('simple', '1.b');
   alias   |   description    | token | dictionaries | dictionary | lexemes
-----------+------------------+-------+--------------+------------+---------
 uint      | Unsigned integer | 1     | {simple}     | simple     | {1}
 blank     | Space symbols    | .     | {}           |            |
 asciiword | Word, all ASCII  | b     | {simple}     | simple     | {b}
(3 rows)
postgres=# select * from ts_debug('simple', '1.bb');
 alias | description | token | dictionaries | dictionary | lexemes
-------+-------------+-------+--------------+------------+---------
 host  | Host        | 1.bb  | {simple}     | simple     | {1.bb}
(1 row)
I don't know the exact rules that cause classification of something
as a "host" token. It does seem a little weird that length matters.
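For anyone who wants to poke at the classification directly, here is a minimal sketch. It assumes the stock `default` parser (the one the `simple` configuration uses) and joins `ts_parse` against `ts_token_type` so each token is shown with its type name. The sample string is just the one from the question; other inputs can be substituted to see where the classification changes.

```sql
-- Show how the default parser tokenizes a string, with readable type names.
-- ts_parse returns (tokid, token); ts_token_type returns (tokid, alias, description).
SELECT p.token, t.alias, t.description
FROM ts_parse('default', '1.bb') AS p
JOIN ts_token_type('default') AS t USING (tokid);
```

This only reports what the parser decided; it does not explain the rules behind the decision.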
regards, tom lane