Re: Full text search on partial URLs

From: Zev Benjamin <zev(at)strangersgate(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Cc: bricklen <bricklen(at)gmail(dot)com>
Subject: Re: Full text search on partial URLs
Date: 2014-01-03 20:05:30
Message-ID: 52C7180A.7070602@strangersgate.com
Lists: pgsql-general

On 11/15/2013 07:40 PM, Zev Benjamin wrote:
>
> One problem that I've run into here is that I would also like to
> highlight matched text in my application. For my existing search
> solution, I do this with ts_headline. For partial matches, it's
> unfortunately not just a matter of searching for the text and adding the
> appropriate markup because my documents are HTML (the FTS lexer
> helpfully pulls out all the HTML tags so it hasn't been a problem so
> far) and we don't want to accidentally "highlight" some of the
> attributes of the markup.
>
> One way to solve this would be if there were a way to turn a tsvector
> and tsquery pair into a list of the offsets and lengths of the lexemes
> that match. The highlighting could then be done at the application
> level rather than the database level while still leveraging Postgres's
> FTS functionality.
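
A minimal sketch of the application-side half of the idea quoted above, assuming the database has already returned the matches as (offset, length) pairs. The function name `highlight`, the 0-based character offsets into the original HTML string, and the `<b>` markers are all assumptions made for illustration; they are not taken from the attached C code.

```python
from typing import Iterable, List, Tuple


def highlight(doc: str, matches: Iterable[Tuple[int, int]],
              start: str = "<b>", stop: str = "</b>") -> str:
    """Wrap each (offset, length) span of doc in start/stop markers.

    Offsets are assumed to be 0-based character positions in doc.
    """
    out: List[str] = []
    pos = 0  # end of the last span copied into out
    for offset, length in sorted(matches):
        # Skip empty, overlapping, or out-of-range spans rather than guess.
        if length <= 0 or offset < pos or offset + length > len(doc):
            continue
        out.append(doc[pos:offset])                       # untouched text
        out.append(start + doc[offset:offset + length] + stop)
        pos = offset + length
    out.append(doc[pos:])                                 # trailing text
    return "".join(out)


# Hypothetical document: "search" appears in an attribute and in the text.
doc = '<a href="search.html">full text search</a>'
# Stand-in for what the database would report: only the occurrence in the text.
text_match = (doc.index("search", doc.index(">")), len("search"))
marked = highlight(doc, [text_match])
```

Because only the reported spans are wrapped, the "search" inside the href attribute is left alone, which is the property a plain string search-and-replace over the HTML cannot guarantee.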

I've written C functions to implement this and attached them to this
email. The support files necessary for making a module are available at
https://github.com/zbenjamin/tsearch_extras. I'm new to the PostgreSQL
code base so any feedback or comments would be greatly appreciated.
Would these be appropriate to submit as patches to PostgreSQL?

Thanks,
Zev

Attachment: tsearch_extras.c (text/x-csrc, 4.9 KB)
