Some questions about postgresql's default text search parser

From:	johannes graën <johannes(at)selfnet(dot)de>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Some questions about postgresql's default text search parser
Date:	2013-11-06 10:08:05
Message-ID:	CA++JNSf_Dkw=Uy48M5vCWSatC_zzuHae16fOzB8Lu2anQsvVfA@mail.gmail.com
Lists:	pgsql-general

Hi everyone,

I've been trying to understand the behaviour of the text search parser.
Looking at the source code [1], it appears to be a sophisticated FSM
that maps the input string to a list of tuples, each consisting of a
category (as defined in [1], lines 32-56, or [2]) and a substring of
the input, ordered by the substring's position in the original string.
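
The (category, substring) pairs described above can be inspected
directly. A minimal sketch, assuming the built-in parser is registered
under the name 'default' (the sample string is only an illustration):

```sql
-- List the token categories (tokid, alias, description) the parser can emit.
SELECT * FROM ts_token_type('default');

-- Show the (tokid, token) pairs produced for a sample string, in input order.
SELECT * FROM ts_parse('default', 'peut-être 10 000 personnes');
```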

* Is there any documentation to be found on this parser?

As the parser is not aware of the underlying language, I would like to
write my own.

* Is adding one to pg_ts_parser the right way, or should this rather
be done outside of the PG internals?
* In the first case, is there any manual or documentation on how to do so?

To see what I am aiming at, try these queries:

select (ts_parse(3722, s)).*, (ts_debug(s)).*, (ts_debug('french', s)).*
from (select 'aujourd''hui ils m''ont dit qu''il y aura peut-être plus de 10 000 personnes'::text s) x;

select (ts_parse(3722, s)).*, (ts_debug(s)).*
from (select 'heu d''anar-hi'::text s) x;

Best
Johannes

[1] http://doxygen.postgresql.org/wparser__def_8c_source.html
[2] http://www.postgresql.org/docs/9.3/static/textsearch-parsers.html
