Re: Creating a custom email token parser for FTS

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: sfpug(at)postgresql(dot)org
Subject: Re: Creating a custom email token parser for FTS
Date: 2014-01-16 00:16:24
Message-ID: 52D724D8.9040108@agliodbs.com
Lists: sfpug

On 01/14/2014 06:09 PM, Brian Ghidinelli wrote:
>
> I've looked through the docs but I'm not clear on how, or if it's
> possible, to create a new token for FTS?
>
> Specifically I would like a variant of the email token that breaks it up
> into username, host, domain and drops the TLD so we can do partial
> searching. I'd like to change it so this doesn't fail:
>
> select to_tsvector('brian@hotmail.com') @@ to_tsquery('hotmail')
>
> How can a new token be added? Or update an existing token?

Unfortunately, the token types are hardcoded in the default parser's C
code; you have to fork it (or write your own parser in C and register it
with CREATE TEXT SEARCH PARSER) to add your own tokens. That's not the
way it should be, but it's the way it is.

Also consider that email addresses can be MUCH more complex than the above.
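
If forking the parser is too heavy, one possible workaround is to
pre-process the address in SQL before it reaches the parser. The sketch
below is only an illustration under stated assumptions: the function name
email_to_tsvector is hypothetical (not a built-in), it uses the 'simple'
configuration so no stemming or stop words apply, and it only handles
plain user@host.domain.tld addresses, not the more complex forms
mentioned above.

```sql
-- Hypothetical helper, not part of PostgreSQL: strips the final ".tld",
-- then turns '@' and '.' into spaces so each part becomes its own word.
CREATE FUNCTION email_to_tsvector(addr text) RETURNS tsvector AS $$
    SELECT to_tsvector(
        'simple',
        translate(
            regexp_replace(addr, '\.[^.@]+$', ''),  -- drop the TLD
            '@.', '  '                              -- split on '@' and '.'
        )
    );
$$ LANGUAGE sql IMMUTABLE;

-- Usage: the partial search from the original question.
SELECT email_to_tsvector('brian@hotmail.com') @@ to_tsquery('simple', 'hotmail');
```

With this approach the query from the question should match, because the
parser sees 'brian hotmail' as two ordinary words rather than one email
token. Since the function is declared IMMUTABLE, it could also be used in
an expression index, assuming the stored addresses fit the simple format.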

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
