Re: ERROR: Gin doesn't support full scan due to it's awful

From: Charlie Savage <cfis(at)savagexi(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: ERROR: Gin doesn't support full scan due to it's awful
Date: 2006-09-05 23:26:40
Message-ID: 44FE07B0.3050803@savagexi.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Tom,

Thanks for the quick reply.

>> Sorry, mistyped the query causing the problem. It is:
>
>> select *
>> from maps, features
>> where maps.query @@ features.tags_vector;
>
> In that case it's fair to ask what query values you have stored in maps.
> In particular I imagine that you'll find that a specific query is
> causing the problem ...
>
> regards, tom lane

Interesting...that seems to be the case. For example, this will fail:

explain analyze
select *
from test.features
where to_tsquery('') @@ features.vector

ERROR: Gin doesn't support full scan due to it's awful inefficiency

Interestingly this works:

explain analyze
select *
from test.features
where NULL @@ features.vector

Here is a slightly bigger test case:

--drop schema test cascade;
create schema test;

CREATE TABLE test.maps
(
id serial,
query tsquery
);

CREATE TABLE test.features
(
id serial,
vector tsvector
);

CREATE INDEX features_vector ON test.features USING gin (vector);

INSERT INTO test.maps (query)
VALUES (to_tsquery(''));

INSERT INTO test.features (vector)
VALUES (to_tsvector('test'));

analyze test.maps;
analyze test.features;

----------

Now try this, which won't work (ERROR: Gin doesn't support full scan
due to it's awful inefficiency):

set enable_seqscan to off;

explain
select *
from test.maps, test.features
where features.vector @@ maps.query

Nested Loop (cost=100000000.00..100000004.04 rows=1 width=36)
-> Seq Scan on maps (cost=100000000.00..100000001.01 rows=1 width=12)
-> Index Scan using features_vector on features (cost=0.00..3.01
rows=1 width=24)
Index Cond: (features.vector @@ "outer".query)

However, this works:

set enable_seqscan to on;
set enable_indexscan to off;
set enable_bitmapscan to off;

explain analyze
select *
from test.maps, test.features
where features.vector @@ maps.query

Nested Loop (cost=200000000.00..200000002.03 rows=1 width=36) (actual
time=0.055..0.055 rows=0 loops=1)
Join Filter: ("inner".vector @@ "outer".query)
-> Seq Scan on maps (cost=100000000.00..100000001.01 rows=1
width=12) (actual time=0.011..0.014 rows=1 loops=1)
-> Seq Scan on features (cost=100000000.00..100000001.01 rows=1
width=24) (actual time=0.006..0.010 rows=1 loops=1)
Total runtime: 0.129 ms

You see the same things if you put a NULL in the query column (unlike
above). If instead, you do this in the script above:

INSERT INTO test.maps (query)
VALUES (to_tsquery('test'));

Then it always works.

Seems like the moral of the story, tsquery values of '' or NULL don't
work.

That is surprising to me - maybe the documentation should point out this
issue?

Thanks,

Charlie

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Richard Broersma Jr 2006-09-06 00:03:22 Re: Removing duplicate keys and updating deleted entry key in other table
Previous Message gustavo halperin 2006-09-05 23:08:15 Re: [OT] sig sizes (was Re: Porting from ...)