Quick Links

Re: Text search lexer's handling of hyphens and negatives

From:	Alan Hodgson <ahodgson(at)lists(dot)simkin(dot)ca>
To:	pgsql-general(at)lists(dot)postgresql(dot)org
Subject:	Re: Text search lexer's handling of hyphens and negatives
Date:	2019-10-16 15:50:59
Message-ID:	45772fae97692f848fa102b0ddb92ad0c6d6b5bc.camel@lists.simkin.ca
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On Tue, 2019-10-15 at 20:34 -0700, raylu wrote:
> On Tue, Oct 15, 2019 at 3:35 PM Alan Hodgson <ahodgson(at)lists(dot)simkin(dot)ca
> > wrote:
> > My company has found the pg_trm extension to be more useful for
> > partial text searches than the full text functions. I don't know
> > specifically how it might help with your hyphens but it would be
> > worth testing. The docs actually suggest using them in conjunction
> > in some cases.
>
> We actually do use pg_trgm already for the names/titles of
> things.Indexing the content with a trigram index and then
> doingLOWER(content) LIKE '%789-xyz%' would certainly work, but1. we'd
> have to do a little bit of finagling if we wanted to match onword
> boundaries (don't match '6789-xyza' in the above example)2. trigram
> indexes are pretty huge for long documents, which is why wecurrently
> only use them for names/titles
> We may give up and just use pg_trgm for contents if nothing else
> worksout but it feels like the text search lexer is _so_ close to what
> wewant.

Maybe you could have a trigger pull out those specific hypenated
references into a separate column when the document is added or updated,
and store/index those separately?

In response to

Re: Text search lexer's handling of hyphens and negatives at 2019-10-16 03:34:53 from raylu

Browse pgsql-general by date

	From	Date	Subject
Next Message	Alexander Pyhalov	2019-10-16 16:29:55	PostgreSQL memory usage
Previous Message	Adrian Klaver	2019-10-16 14:45:39	Re: Analyze and vaccum