From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Marcin(dot)Kasperski(at)mekk(dot)waw(dot)pl |
Cc: | pgsql-bugs(at)postgresql(dot)org, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Teodor Sigaev <teodor(at)sigaev(dot)ru> |
Subject: | Re: BUG #6327: Prefix full-text-search fails for hosts with complicated names |
Date: | 2011-12-05 15:38:04 |
Message-ID: | 27803.1323099484@sss.pgh.pa.us |
Lists: | pgsql-bugs |
Marcin(dot)Kasperski(at)mekk(dot)waw(dot)pl writes:
> Synopsis
> =========
> 'goog:*' matches google.com
> but
> 'e-goog:*' does not match e-google.com
The reason for this seems to be that the pattern is treated as a
hyphenated word:
regression=# select TO_TSQUERY('english', 'e-goog:*');
          to_tsquery           
-------------------------------
 'e-goog':* & 'e':* & 'goog':*
(1 row)
but the hostname isn't:
regression=# select TO_TSVECTOR('english', 'See e-google.com');
       to_tsvector        
--------------------------
 'e-google.com':2 'see':1
(1 row)
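To see where the two diverge, ts_debug reports the token type the parser assigns to each piece of the input. A minimal sketch, assuming the stock english configuration; the exact rows depend on the server version, so no output is shown:

```sql
-- Compare how the default parser tokenizes the hostname form
-- and the space-separated form of the same text.
SELECT alias, token, lexemes
FROM ts_debug('english', 'See e-google.com');

SELECT alias, token, lexemes
FROM ts_debug('english', 'See e-google com');
```

In the first query the whole of e-google.com should show up as a single host token, while in the second the parser should report a hyphenated word plus its parts.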
If you change the text so it's not recognized as a hostname, you get
lexemes that would match the query:
regression=# select TO_TSVECTOR('english', 'See e-google com');
                 to_tsvector                 
---------------------------------------------
 'com':5 'e':3 'e-googl':2 'googl':4 'see':1
(1 row)
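The difference can be checked directly with the @@ operator. This is only a sketch against the english configuration; going by the outputs above, the first column should come out false (nothing in the hostname vector starts with 'goog') and the second true, but other releases may build the query differently:

```sql
-- Same prefix query against the hostname form and the
-- space-separated form of the text.
SELECT to_tsvector('english', 'See e-google.com')
         @@ to_tsquery('english', 'e-goog:*') AS hostname_form,
       to_tsvector('english', 'See e-google com')
         @@ to_tsquery('english', 'e-goog:*') AS hyphenated_form;
```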
Possibly we could fix this by hacking the ts parser so that it would
also apply the hyphenated-word rules to a hostname containing a dash.
In general though, there are always going to be cases where prefix
match doesn't work because of dictionary transformations ...
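An illustration of that last point that has nothing to do with hostnames, offered as a sketch only: it assumes the english stemmer reduces "running" to "run" while leaving the partial word "runni" alone, in which case the prefix query would not be expected to match even though the typed text is a literal prefix of the original word.

```sql
-- What the document and the prefix query each normalize to.
SELECT to_tsvector('english', 'running')  AS doc_lexemes,
       to_tsquery('english', 'runni:*')   AS prefix_query;

-- Prefix match is done against the stemmed lexeme, not the raw word.
SELECT to_tsvector('english', 'running')
         @@ to_tsquery('english', 'runni:*') AS prefix_matches;
```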
regards, tom lane