ts_headline and query with hyphen

From: daniel <dochtorek(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: ts_headline and query with hyphen
Date: 2012-12-05 03:31:35
Message-ID: 50BEC017.7070507@gmail.com
Lists: pgsql-general

Hi

I have a question about ts_headline. When the query includes a word like
'on-line', only the 'line' part is highlighted, even though the whole
phrase is indexed too. Some details below.

PostgreSQL 9.1.6

select
token, dictionary, lexemes
from
ts_debug('play on-line') where alias <> 'blank';

  token  |  dictionary  | lexemes
---------+--------------+----------
 play    | english_stem | {play}
 on-line | english_stem | {on-lin}
 on      | english_stem | {}
 line    | english_stem | {line}

select to_tsquery('play & on-line');
to_tsquery
----------------------------
'play' & 'on-lin' & 'line'

select ts_headline('play on-line', to_tsquery('play & on-line'));

ts_headline
----------------------------
<b>play</b> on-<b>line</b>

This is the same result as for:

select ts_headline('play on-line', to_tsquery('play & line'));
ts_headline
----------------------------
<b>play</b> on-<b>line</b>

Is that the intended behaviour? I guess the problem here is that 'on' is
not a lexeme, but then what about 'on-lin'?
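
To make the question easier to reproduce, here is a small sketch that also shows the parser's token type for each token and the lexemes stored for the document. It assumes the 'english' text search configuration (which the english_stem dictionary in the output above suggests); the explicit configuration argument is my addition and was not part of the queries I ran above.

```sql
-- Assumption: the 'english' configuration is the one in use.
-- Show the parser's token type (alias) next to each token, so the
-- hyphenated word and its parts can be told apart.
select alias, token, lexemes
from ts_debug('english', 'play on-line')
where alias <> 'blank';

-- Show which lexemes (with positions) are actually stored for the document.
select to_tsvector('english', 'play on-line');
```

Comparing the alias column for 'on-line' with the aliases for 'on' and 'line' shows which of the three tokens the parser treats as the whole hyphenated word and which as its parts.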

In another example, I expected that a hyphenated match would get some
kind of preference:

select token, dictionary, lexemes from ts_debug('custom-built query')
where alias <> 'blank';
    token     |  dictionary  |    lexemes
--------------+--------------+----------------
 custom-built | english_stem | {custom-built}
 custom       | english_stem | {custom}
 built        | english_stem | {built}
 query        | english_stem | {queri}

select to_tsquery('query & custom-built');
to_tsquery
-----------------------------------------------
'queri' & 'custom-built' & 'custom' & 'built'

select ts_headline('custom-built query', to_tsquery('query & custom-built'));
ts_headline
-----------------------------------------
<b>custom</b>-<b>built</b> <b>query</b>

This works better, but both parts of 'custom-built' are still
highlighted separately. Or does ts_headline operate only on single
words, not on hyphenated ones?

thanks
daniel
