Re: Text search lexer's handling of hyphens and negatives

From: "Daniel Verite" <daniel(at)manitou-mail(dot)org>
To: "raylu" <lurayl(at)gmail(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Text search lexer's handling of hyphens and negatives
Date: 2019-10-16 11:38:45
Message-ID: abf85423-ce4b-45ec-97a4-2789a400ed55@manitou-mail.org
Lists: pgsql-general

raylu wrote:

> to_tsvector('simple', 'UVW-789-XYZ') is
> 'uvw':1 '-789':2 'xyz':3
> because -789 is a negative integer. If we turn the query '789-XYZ'
> into the tsquery as before, we get to_tsquery('simple', '789 <-> xyz')
> which doesn't match it.
>
> Are we missing something here? Is there either a way to
> 1. generate tsvectors without this special (negative) integer behavior or

As an ad-hoc solution, you could add a dictionary that turns a negative
integer into its positive counterpart. There's a dictionary in contrib that
can be used as a starting point:
https://www.postgresql.org/docs/current/dict-int.html

It's a matter of ~10 lines of C code to add an "abs" parameter to
that dictionary that would, when true, produce "789" as a lexeme
when fed "-789" as input.
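
To give an idea of the core of such a change, here is a rough standalone
sketch of the sign-stripping step. It is only an illustration, not code
from contrib/dict_int: the helper name normalize_int_token and the flag
name abs_mode are made up for this example, and a real patch would have to
do this inside the dictionary's lexize function and read the flag from the
dictionary options.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of contrib/dict_int).
 *
 * Copies the integer token "in" (length "len", not necessarily
 * NUL-terminated) into "out" as a NUL-terminated string. When abs_mode
 * is true, a single leading '-' or '+' is dropped first, so "-789"
 * becomes "789". Returns false if "out" is too small for the result.
 */
static bool
normalize_int_token(const char *in, size_t len, bool abs_mode,
                    char *out, size_t outsize)
{
    /* Skip the sign only when the (assumed) "abs" option is enabled. */
    if (abs_mode && len > 0 && (in[0] == '-' || in[0] == '+'))
    {
        in++;
        len--;
    }

    /* Need room for the digits plus the terminating NUL. */
    if (len + 1 > outsize)
        return false;

    memcpy(out, in, len);
    out[len] = '\0';
    return true;
}
```

With the flag off the token is passed through unchanged, so existing
configurations would keep their current behavior.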

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
