Re: tsearch2, large data and indexes

From:	Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
To:	Ivan Voras <ivoras(at)freebsd(dot)org>
Cc:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: tsearch2, large data and indexes
Date:	2014-04-23 22:56:32
Message-ID:	CAL_0b1vo9ibxHG3MayGKNxiOJH3is3kav5E1x+eBQZv6QC4Wow@mail.gmail.com
Lists:	pgsql-performance

On Wed, Apr 23, 2014 at 4:08 AM, Ivan Voras <ivoras(at)freebsd(dot)org> wrote:
> Ok, I found out what is happening, quoting from the documentation:
>
> "GIN indexes are not lossy for standard queries, but their performance
> depends logarithmically on the number of unique words. (However, GIN
> indexes store only the words (lexemes) of tsvector values, and not
> their weight labels. Thus a table row recheck is needed when using a
> query that involves weights.)"
>
> My query doesn't have weights but the tsvector in the table has them -
> I take it this is what is meant by "involves weights."
>
> So... there's really no way for tsearch2 to produce results based on
> the index alone, without recheck? This is... limiting.

My guess is that you could use the strip() function [1] to get rid of
the weights, either in the table itself or, probably better, only in the
index, by using the same expression in the index and in the query, e.g.

...USING gin (strip(fts_data))

and

... WHERE strip(fts_data) @@ q
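
Spelled out a bit more, the idea might look like the sketch below. Only
the column name fts_data and the alias q come from the fragments above;
the table name docs, the index name and the sample query terms are
made up for illustration. Whether this really avoids the recheck is
still a guess, so it is worth comparing the plans with EXPLAIN.

```sql
-- Hypothetical table: only the column name fts_data is taken from the thread.
CREATE TABLE docs (
    id       serial PRIMARY KEY,
    body     text,
    fts_data tsvector  -- may carry weight labels, which stay in the table
);

-- Expression index: strip() removes weights and positions, but only
-- from the values stored in the index, not from the table column.
CREATE INDEX docs_fts_stripped_idx ON docs USING gin (strip(fts_data));

-- The WHERE clause has to use the same expression as the index,
-- otherwise the planner will not consider the index for this query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT d.id
FROM docs AS d,
     to_tsquery('english', 'index & recheck') AS q
WHERE strip(d.fts_data) @@ q;
```

Since the weighted fts_data column is left untouched, it should still be
possible to rank the matching rows with ts_rank(fts_data, q).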

[1] http://www.postgresql.org/docs/9.3/static/textsearch-features.html

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray(dot)ru(at)gmail(dot)com

In response to

Re: tsearch2, large data and indexes at 2014-04-23 11:08:25 from Ivan Voras

Responses

Re: tsearch2, large data and indexes at 2014-04-24 11:34:22 from Heikki Linnakangas
