From: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Marcin(dot)Kasperski(at)mekk(dot)waw(dot)pl, pgsql-bugs(at)postgresql(dot)org, Teodor Sigaev <teodor(at)sigaev(dot)ru> |
Subject: | Re: BUG #6327: Prefix full-text-search fails for hosts with complicated names |
Date: | 2011-12-06 14:55:12 |
Message-ID: | Pine.LNX.4.64.1112061853230.7458@sn.sai.msu.ru |
Lists: | pgsql-bugs |
On Mon, 5 Dec 2011, Tom Lane wrote:
> Marcin(dot)Kasperski(at)mekk(dot)waw(dot)pl writes:
>> Synopsis
>> =========
>
>> 'goog:*' matches google.com
>> but
>> 'e-goog:*' does not match e-google.com
>
> The reason for this seems to be that the pattern is treated as a
> hyphenated word:
>
> regression=# select TO_TSQUERY('english', 'e-goog:*');
> to_tsquery
> -------------------------------
> 'e-goog':* & 'e':* & 'goog':*
> (1 row)
>
> but the hostname isn't:
>
> regression=# select TO_TSVECTOR('english', 'See e-google.com');
> to_tsvector
> --------------------------
> 'e-google.com':2 'see':1
> (1 row)
>
> If you change the text so it's not recognized as a hostname, you get
> lexemes that would match the query:
>
> regression=# select TO_TSVECTOR('english', 'See e-google com');
> to_tsvector
> ---------------------------------------------
> 'com':5 'e':3 'e-googl':2 'googl':4 'see':1
> (1 row)
>
> Possibly we could fix this by hacking the ts parser so that it would
> also apply the hyphenated-word rules to a hostname containing a dash.
>
> In general though, there are always going to be cases where prefix
> match doesn't work because of dictionary transformations ...
I'd index the lexemes produced after dictionary transformations as well as
the original token, so that prefix match always works.
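
A rough user-level approximation of that idea, for illustration only: it does not change the parser or the dictionaries. It assumes the document text is available as a column (here the hypothetical `doc`) and that replacing dots with spaces is acceptable for the data at hand. The hostname token is kept as parsed, and the hyphenated-word lexemes from the dot-free text are added alongside it:

```sql
-- Sketch: combine the tsvector of the original text (keeps 'e-google.com')
-- with the tsvector of the same text with dots replaced by spaces
-- (adds 'e-googl', 'e', 'googl', 'com', as in the output quoted above).
-- "doc" is a hypothetical column name used only for this example.
SELECT (to_tsvector('english', doc)
        || to_tsvector('english', replace(doc, '.', ' ')))
       @@ to_tsquery('english', 'e-goog:*') AS matches
FROM (VALUES ('See e-google.com')) AS t(doc);
```

Given the lexemes shown in the quoted outputs, each of the three prefix terms in the query should find a match in the combined vector. Note that `||` shifts the positions of the second vector, so phrase and ranking behaviour may differ from a single-pass index.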
>
> regards, tom lane
>
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83