From: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
---|---|
To: | Ben <bench(at)silentmedia(dot)com> |
Cc: | Teodor Sigaev <teodor(at)sigaev(dot)ru>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: making tsearch2 dictionaries |
Date: | 2004-02-17 11:15:57 |
Message-ID: | Pine.GSO.4.58.0402171337160.3452@ra.sai.msu.su |
Lists: | pgsql-general |
On Mon, 16 Feb 2004, Ben wrote:
> So I noticed. ;) The dictionary's working, and I'd be happy to expand
> upon the documentation. Just point me at something to work on.
>
I think you could simply write a short paper, "How I built a custom dictionary
for tsearch2". From what I've read, your dictionary could be interesting to
people, especially if you describe the motivation and usage.
Do you want '100' and 'hundred' to be fully equivalent? Then a search
for '100' will find documents containing 'hundred'. Interestingly, it will
also find '123', because '123' becomes 'one hundred twenty three'.
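To make the '123' case concrete, here is a minimal sketch of the kind of expansion I have in mind. It is not Ben's dictionary and not part of tsearch2; number_to_words and contains_word are hypothetical names, and the 0..999 range is an assumption made only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers, not part of tsearch2. */
static const char *ONES[] = {
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen"
};
static const char *TENS[] = {
    "", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
    "eighty", "ninety"
};

/* Spell out n (0..999) as English words. Stores pointers to static
 * strings in out (room for at least 4 entries) and returns the count. */
size_t number_to_words(int n, const char **out)
{
    size_t count = 0;

    assert(n >= 0 && n <= 999);

    if (n >= 100) {
        out[count++] = ONES[n / 100];
        out[count++] = "hundred";
        n %= 100;
        if (n == 0)
            return count;
    }
    if (n >= 20) {
        out[count++] = TENS[n / 10];
        n %= 10;
        if (n == 0)
            return count;
    }
    out[count++] = ONES[n];
    return count;
}

/* Return 1 if word occurs among the first nwords entries of words. */
int contains_word(const char **words, size_t nwords, const char *word)
{
    size_t i;

    for (i = 0; i < nwords; i++)
        if (strcmp(words[i], word) == 0)
            return 1;
    return 0;
}
```

With this expansion both '100' and '123' produce the word 'hundred', which is why a search for one can match a document containing the other.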
> But, like I said, I really want to figure out a way to pipe the output
> of my dictionary through another dictionary. If I can't do that, it
> doesn't seem as useful, because "100" (handled by my dictionary) and
> "one hundred" (handled by en_stem) currently don't generate the same
> ts_vector.
What's the problem? You can configure which dictionaries are used, and in
what order, for each type of token (see the pg_ts_cfgmap table).
Aha, I see your problem:
www=# select * from ts_debug('one hundred');
ts_name | tok_type | description | token | dict_name | tsvector
-----------------+----------+-------------+---------+-----------+----------
default_russian | lword | Latin word | one | {en_stem} | 'one'
default_russian | lword | Latin word | hundred | {en_stem} | 'hundr'
'hundred' becomes 'hundr'. You could use a synonym dictionary, which is
rather simple
(see http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_Notes for details).
Note that once a word is recognized by the synonym dictionary, it is not
passed to the next dictionary! This is how tsearch2 works with any dictionary.
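The "first dictionary that recognizes the word wins" rule can be modelled with a toy chain like the one below. This is only a sketch and not the tsearch2 API: dict_fn, toy_synonym, toy_stem and lexize_chain are hypothetical names, and the toy stemmer knows a single word. The synonym entry maps '100' straight to 'hundr' on the assumption that its output must already match what en_stem produces for 'hundred', since it will not be stemmed afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model, not the tsearch2 API: a dictionary returns a lexeme when it
 * recognizes the token and NULL when it does not. */
typedef const char *(*dict_fn)(const char *token);

/* Toy synonym dictionary with one entry (an assumption for illustration). */
const char *toy_synonym(const char *token)
{
    if (strcmp(token, "100") == 0)
        return "hundr";
    return NULL;            /* unknown word: let the next dictionary try */
}

/* Toy stand-in for en_stem: stems one word, passes everything else through. */
const char *toy_stem(const char *token)
{
    if (strcmp(token, "hundred") == 0)
        return "hundr";
    return token;
}

/* Try each dictionary in order; the first one that recognizes the token
 * decides the result, and later dictionaries never see it. */
const char *lexize_chain(const dict_fn *dicts, size_t ndicts, const char *token)
{
    size_t i;

    for (i = 0; i < ndicts; i++) {
        const char *lexeme = dicts[i](token);

        if (lexeme != NULL)
            return lexeme;
    }
    return NULL;
}
```

With the chain { toy_synonym, toy_stem }, both '100' and 'hundred' end up as 'hundr', while a word the synonym dictionary does not know falls through to the stemmer.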
>
> Once I figure out how to tweak the parser to parse things the way I
> want, I can expand upon those docs too. Looks like I'm going to need to
> reach waaaay back into my brain and dust off my flex knowledge for that,
> though....
What do you want from the parser?
>
> On Mon, 2004-02-16 at 10:33, Oleg Bartunov wrote:
> > btw, Ben, if you get your dictionary working, could you describe the
> > development process so other people can appreciate your work? This part
> > of the tsearch2 documentation is very weak.
> >
> > Oleg
> >
> > On Mon, 16 Feb 2004, Teodor Sigaev wrote:
> >
> > >
> > >
> > > Ben wrote:
> > > > Thanks for the replies. Just to clarify what I was doing, my quasi-code
> > > > looked something like:
> > > >
> > > > phrase = palloc(8);
> > > > phrase = "foo\0bar\0";
> > > > res = palloc(3);
> > > > res[0] = phrase[0];
> > > > res[1] = phrase[5];
> > > > res[2] = 0;
> > > >
> > > > That crashed. Once I changed it to:
> > > >
> > > > res = palloc(3);
> > > > res[0] = palloc(4);
> > > > res[0] = "foo\0";
> > > > res[1] = palloc(4);
> > > > res[2] = "bar\0";
> > > > res[3] = 0;
> > > >
> > > > it worked.
> > > >
> > > :)
> > > I hope you mean:
> > > res = palloc(3 * sizeof(char *));  /* three pointers, not three bytes */
> > > res[0] = palloc(4);
> > > memcpy(res[0] ,"foo", 4);
> > > res[1] = palloc(4);
> > > memcpy(res[1] ,"bar", 4);
> > > res[2] = 0;
> > >
> > > Look at indexes of res.
> > >
> > >
> >
> > Regards,
> > Oleg
> > _____________________________________________________________
> > Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
> > Sternberg Astronomical Institute, Moscow University (Russia)
> > Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
> > phone: +007(095)939-16-83, +007(095)939-23-83
>
>
>
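One more remark on the res array quoted above: the outer allocation has to be sized in pointers (count * sizeof(char *)), not in bytes, and the terminator goes at index count. Below is a standalone sketch of that pattern. It uses malloc() in place of palloc(), because palloc() is available only inside the PostgreSQL backend, and make_lexemes and free_lexemes are hypothetical names, not tsearch2 functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

void free_lexemes(char **res);

/* Build a NULL-terminated array holding private copies of nwords strings.
 * Returns NULL if an allocation fails. malloc() stands in for palloc(). */
char **make_lexemes(const char *const *words, size_t nwords)
{
    size_t i;
    char **res;

    assert(words != NULL);

    /* nwords pointers plus one slot for the NULL terminator */
    res = malloc((nwords + 1) * sizeof(char *));
    if (res == NULL)
        return NULL;

    for (i = 0; i < nwords; i++) {
        size_t len = strlen(words[i]) + 1;   /* include the trailing '\0' */

        res[i] = malloc(len);
        if (res[i] == NULL) {
            free_lexemes(res);               /* res[i] == NULL ends the list */
            return NULL;
        }
        memcpy(res[i], words[i], len);
    }
    res[nwords] = NULL;
    return res;
}

/* Free every string and then the array itself. */
void free_lexemes(char **res)
{
    char **p;

    if (res == NULL)
        return;
    for (p = res; *p != NULL; p++)
        free(*p);
    free(res);
}
```

Called with the two words "foo" and "bar", this gives res[0] = "foo", res[1] = "bar" and res[2] = NULL, which is the layout Teodor's indexes describe.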
Regards,
Oleg