Re: tsearch2 dictionary for statute cites

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Cc: <pgsql-general(at)postgresql(dot)org>,"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: tsearch2 dictionary for statute cites
Date: 2009-04-07 18:28:41
Message-ID: 49DB5509.EE98.0025.0@wicourts.gov
Lists: pgsql-general

Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> wrote:
> contrib/test_parser - an example parser code.

Using that as a template, I seem to be on track to use the regexp.c
code to pick out statute cites from the text in my start function, and
recognize when I'm positioned on one in my getlexeme (GETTOKEN)
function, delegating everything before, between, and after statute
cites to the default parser. (I really didn't want to copy/paste and
modify the whole default parser.)
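
A rough sketch of that control flow, written in Python purely for
illustration (the real parser would be C built on regexp.c and the
test_parser template). The cite pattern and the name segment() are
assumptions made for this example, not anything taken from the
PostgreSQL sources: the text is scanned once for cites, and everything
before, between, and after them is labeled as belonging to the default
parser.

```python
import re

# Assumed cite shape: chapter.section followed by zero or more
# parenthesized subdivisions, e.g. 341.15(3)(b). Real cites may differ.
CITE_RE = re.compile(r"\d+\.\d+(?:\(\w+\))*")

def segment(text):
    """Split text into ("cite", s) and ("other", s) pieces, in order.

    "other" pieces stand for the spans that would be handed to the
    default parser; "cite" pieces would be emitted as single tokens.
    """
    pieces = []
    pos = 0
    for m in CITE_RE.finditer(text):
        if m.start() > pos:
            # Text before this cite goes to the default parser.
            pieces.append(("other", text[pos:m.start()]))
        pieces.append(("cite", m.group()))
        pos = m.end()
    if pos < len(text):
        # Trailing text after the last cite.
        pieces.append(("other", text[pos:]))
    return pieces
```

For instance, segment("See 341.15(3) and 29.01.") yields the two cites
as their own pieces, with "See ", " and ", and "." left over for the
default parser.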

That leaves one question I'm still pretty fuzzy on -- how do I go
about having a statute cite in a tsquery match the entire statute cite
from a tsvector, or delimited leading portions of it, without having
it match shorter portions?

For example:

If the document text contains '341.15(3)' I want to find it with a
search string of '341', '341.15', or '341.15(3)', but not with
'341.15(3)(b)', '341.1', or '15'. How do I handle that? Do I have to
build my tsquery values myself as text and cast to tsquery, or is there
something more graceful that I'm missing?
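
To make the desired matching rule concrete, here is a small Python
sketch of one possible way to get it: expand each cite in the document
into its delimited leading portions and treat a query cite as a match
only if it equals one of them exactly. This is an illustration of the
rule, not a tsearch2 API; the cite pattern and the names
cite_prefixes() and matches() are assumptions made for the example, and
whether the expansion would live in a parser or a dictionary is left
open.

```python
import re

# Assumed cite shape: chapter, optional .section, then optional
# parenthesized subdivisions (only allowed after a section).
CITE_RE = re.compile(r"^(\d+)(?:\.(\d+)((?:\(\w+\))*))?$")

def cite_prefixes(cite):
    """Return the delimited leading portions of a cite, shortest first."""
    m = CITE_RE.match(cite)
    if m is None:
        raise ValueError("not a recognized cite: %r" % cite)
    chapter, section, subs = m.groups()
    prefixes = [chapter]
    if section is None:
        return prefixes
    current = chapter + "." + section
    prefixes.append(current)
    # Add one subdivision at a time: 341.15(3), then 341.15(3)(b), ...
    for sub in re.findall(r"\(\w+\)", subs or ""):
        current += sub
        prefixes.append(current)
    return prefixes

def matches(document_cite, query_cite):
    """True if query_cite is the whole document cite or a delimited prefix."""
    return query_cite in cite_prefixes(document_cite)
```

With this rule, a document cite of '341.15(3)' expands to '341',
'341.15', and '341.15(3)', so those three queries match by plain
equality while '341.15(3)(b)', '341.1', and '15' do not.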

-Kevin
