Re: tsearch2, large data and indexes

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Ivan Voras <ivoras(at)freebsd(dot)org>
Cc: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: tsearch2, large data and indexes
Date: 2014-04-23 13:00:11
Message-ID: 5357B95B.2050008@vmware.com
Lists: pgsql-performance

On 04/22/2014 10:57 AM, Ivan Voras wrote:
> On 22 April 2014 08:40, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:
>> On 04/20/2014 02:15 AM, Ivan Voras wrote:
>>> More details: after thinking about it some more, it might have
>>> something to do with tsearch2 and indexes: the large data in this case
>>> is a tsvector, indexed with GIN, and the query plan involves a
>>> re-check condition.
>>>
>>> The query is of the form:
>>> SELECT simple_fields FROM table WHERE fts @@ to_tsquery('...').
>>>
>>> Does the "re-check condition" mean that the original tsvector data is
>>> always read from the table in addition to the index?
>>
>> Yes, if the re-check condition involves the fts column. I don't see why you
>> would have a re-check condition with a query like that, though. Are there
>> some other WHERE-conditions that you didn't show us?
>
> Yes, I've read about tsearch2 and GIN indexes and there shouldn't be a
> recheck condition - but there is.
> This is the query:
>
> SELECT documents.id, title, raw_data, q, ts_rank(fts_data, q, 4) AS
> rank, html_filename
> FROM documents, to_tsquery('document') AS q
> WHERE fts_data @@ q
> ORDER BY rank DESC LIMIT 25;

It's the ranking that's causing the detoasting. "ts_rank(fts_data, q,
4)" has to fetch the contents of the fts_data column.
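
One way to check this on your own data is to compare buffer usage with and without the ranking call. This is only a sketch: it reuses the documents table and fts_data column from your query, and the two aggregate queries are illustrative stand-ins I made up so that both visit every matching row.

```sql
-- Baseline: find the matching rows without looking inside fts_data.
-- If the bitmap stays exact, the tsvector should not need to be detoasted.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM documents, to_tsquery('document') AS q
WHERE fts_data @@ q;

-- Same match, but ts_rank() must read fts_data for every matching row,
-- which forces the tsvector to be fetched from the TOAST table.
EXPLAIN (ANALYZE, BUFFERS)
SELECT max(ts_rank(fts_data, q, 4))
FROM documents, to_tsquery('document') AS q
WHERE fts_data @@ q;
```

If the detoasting is the cost, the second plan should report noticeably more buffers hit or read than the first, although the exact numbers depend on caching and on how large the tsvectors are.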

Sorry, I was confused earlier: the "Recheck Cond:" line is always there
in the EXPLAIN output of bitmap index scans, even if the recheck
condition is never executed at runtime. The executor has to be prepared
to run the recheck condition in case the bitmap grows large enough to
become "lossy", at which point it stores only the page numbers of
matching tuples, not the individual tuples.
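
To see whether the recheck actually runs, something like the following could be tried. It is a sketch under assumptions: the bitmap's size is bounded by work_mem, the "Rows Removed by Index Recheck" line needs PostgreSQL 9.2 or later, and the "Heap Blocks: exact=... lossy=..." line needs 9.4 or later. The count(*) query is again just a stand-in for your real one.

```sql
-- Shrink work_mem to its minimum so the bitmap is more likely to go lossy.
SET work_mem = '64kB';

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM documents, to_tsquery('document') AS q
WHERE fts_data @@ q;
-- On the Bitmap Heap Scan node, look for a nonzero "lossy=" count in
-- "Heap Blocks:" and for "Rows Removed by Index Recheck".

-- Give the bitmap enough room to track individual tuples and compare.
SET work_mem = '64MB';

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM documents, to_tsquery('document') AS q
WHERE fts_data @@ q;

RESET work_mem;
```

If the second plan shows only exact heap blocks, the recheck condition is printed but never evaluated, and fts_data is not read for the match itself.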

- Heikki
