Re: Updated tsearch documentation

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Updated tsearch documentation
Date: 2007-07-17 09:58:46
Message-ID: Pine.LNX.4.64.0707171352190.20068@sn.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-advocacy pgsql-hackers

On Tue, 17 Jul 2007, Oleg Bartunov wrote:

> On Tue, 17 Jul 2007, Bruce Momjian wrote:
>
>> I think the tsearch documentation is nearing completion:
>>
>> http://momjian.us/expire/fulltext/HTML/textsearch.html
>>
>> but I am not happy with how tsearch is enabled in a user table:
>>
>> http://momjian.us/expire/fulltext/HTML/textsearch-app-tutorial.html
>>
>> Aside from the fact that it needs more examples, it only illustrates an
>> example where someone creates a table, populates it, then adds a
>> tsvector column, populates that, then creates an index.
>>
>> That seems quite inflexible. Is there a way to avoid having a separate
>> tsvector column? What happens if the table is dynamic? How is that
>> column updated based on table changes? Triggers? Where are the
>> examples? Can you create an index like this:
>
> I agree, that there are could be more examples, but text search doesn't
> require something special !
> *Example* of trigger function is documented on
> http://momjian.us/expire/fulltext/HTML/textsearch-opfunc.html
>

Bruce,

below is an example of trigger for insert/update of example table

create function pgweb_update() returns trigger as
$$
BEGIN
NEW.textsearch_index=
setweight( to_tsvector( coalesce (title,'')), 'A' ) || ' ' ||
setweight( to_tsvector(coalesce (body,'')),'D'); RETURN NEW;
END;
$$
language plpgsql;

CREATE TRIGGER fts_update BEFORE INSERT OR UPDATE ON pgweb
FOR EACH ROW EXECUTE PROCEDURE pgweb_update();

>
>>
>> CREATE INDEX textsearch_id ON pgweb USING gin(to_tsvector(column));
>>
>> That avoids having to have a separate column because you can just say:
>>
>> WHERE to_query('XXX') @@ to_tsvector(column)
>
> yes, it's possible, but without ranking, since currently it's impossible to
> store any information in index (it's pg's feature). btw, this should
> works and for GiST index also.
>
> That kind of search is useful if there is another natural ordering of search
> results, for example, by timestamp.
>
>>
>> How do we make sure that the to_query is using the same text search
>> configuration as the 'column' or index? Perhaps we should suggest:
>
> please, keep in mind, it's not mandatory to use the same configuration
> at search time, that was used at index creation.
>

one example is when text search index created without taking into account
stop-words. Then you could search famous 'to be or not to be' with the
same configuration, or ignore stop words with other.

>>
>> CREATE INDEX textsearch_idx ON pgweb USING
>> gin(to_tsvector('english',column));
>>
>> so that at least the configuration is documented in the index.
>
> yes, it's better to always explicitly specify configuration name and not rely
> on default configuration. Unfortunately, configuration name doesn't saved in
> the index.
>
> Regards,
> Oleg
> _____________________________________________________________
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

In response to

Browse pgsql-advocacy by date

  From Date Subject
Next Message Jim Nasby 2007-07-17 16:04:17 Interesting blog about recent press-release flap (Hi LewisC)
Previous Message Oleg Bartunov 2007-07-17 08:14:24 Re: Updated tsearch documentation

Browse pgsql-hackers by date

  From Date Subject
Next Message Zeugswetter Andreas ADI SD 2007-07-17 10:00:30 Re: Straightforward changes for increased SMP scalability
Previous Message Heikki Linnakangas 2007-07-17 09:09:51 Re: Altering a plan