Re: question about to_tsvector and to_tsquery

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martin Norbäck Olivers <martin(at)norpan(dot)org>
Cc: pgsql-sql(at)lists(dot)postgresql(dot)org
Subject: Re: question about to_tsvector and to_tsquery
Date: 2021-08-24 14:12:51
Message-ID: 334015.1629814371@sss.pgh.pa.us
Lists: pgsql-sql

Martin Norbäck Olivers <martin(at)norpan(dot)org> writes:
> Is there any more information on exactly how to_tsquery and to_tsvector are
> supposed to work?

> select to_tsvector('simple', '1.b') gives '1':1 'b':2
> but
> select to_tsvector('simple', '1.bb') gives '1.bb':1

ts_debug gives a little bit of insight:

postgres=# select * from ts_debug('simple', '1.b');
   alias   |   description    | token | dictionaries | dictionary | lexemes
-----------+------------------+-------+--------------+------------+---------
 uint      | Unsigned integer | 1     | {simple}     | simple     | {1}
 blank     | Space symbols    | .     | {}           |            |
 asciiword | Word, all ASCII  | b     | {simple}     | simple     | {b}
(3 rows)

postgres=# select * from ts_debug('simple', '1.bb');
 alias | description | token | dictionaries | dictionary | lexemes
-------+-------------+-------+--------------+------------+---------
 host  | Host        | 1.bb  | {simple}     | simple     | {1.bb}
(1 row)

I don't know the exact rules that cause classification of something
as a "host" token. It does seem a little weird that length matters.
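
If it helps to isolate the parser from the dictionaries, here is a
minimal sketch using ts_parse and ts_token_type. It assumes the
'simple' configuration uses the built-in 'default' parser, which is
the case for the stock configurations. The alias names s, p and t are
only for illustration.

```sql
-- List every token type the default parser can emit,
-- including "host", "uint", "blank" and "asciiword".
SELECT tokid, alias, description
FROM ts_token_type('default')
ORDER BY tokid;

-- Show how the parser alone splits each input, before any
-- dictionary sees the tokens. The join only maps the numeric
-- token type to its alias.
SELECT s.input, p.tokid, t.alias, p.token
FROM (VALUES ('1.b'), ('1.bb')) AS s(input)
CROSS JOIN LATERAL ts_parse('default', s.input) AS p
JOIN ts_token_type('default') AS t USING (tokid);
```

Since ts_parse does not involve a configuration at all, any
difference it shows between the two inputs comes from the parser's
tokenization rules rather than from the 'simple' dictionary.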

regards, tom lane
