From: | Vyacheslav Kalinin <vka(at)mgcp(dot)com> |
---|---|
To: | General postgres mailing list <pgsql-general(at)postgresql(dot)org> |
Subject: | Incorrect FTS query results with GIN index |
Date: | 2010-01-15 18:19:24 |
Message-ID: | 9b1af80e1001151019p6248c353pe29c0c405d44642b@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hello,
Basically, I started testing prefix matching in FTS and got into troubles.
Self-contained example follows:
postgres=# select version();
version
------------------------------------------------------------------------------------------------------------
PostgreSQL 8.4.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1.2
20080704 (Red Hat 4.1.2-44), 32-bit
postgres=# create table test (id int, data text, tsvd tsvector);
postgres=# insert into test (id, data) values (1, 'hot stuff is here'), (2,
'light is hotter than dark'), (3, 'nothing is that hottie');
postgres=# update test set tsvd = to_tsvector('english', data);
postgres=# select * from test;
id | data | tsvd
----+---------------------------+-------------------------------
1 | hot stuff is here | 'hot':1 'stuff':2
2 | light is hotter than dark | 'dark':5 'hotter':3 'light':1
3 | nothing is that hottie | 'hotti':4 'noth':1
(3 rows)
Now let's play with queries:
postgres=# select * from test where tsvd @@ to_tsquery('english', 'hot');
id | data | tsvd
----+-------------------+-------------------
1 | hot stuff is here | 'hot':1 'stuff':2
(1 row)
postgres=# select * from test where tsvd @@ to_tsquery('english', 'hot:*');
id | data | tsvd
----+---------------------------+-------------------------------
1 | hot stuff is here | 'hot':1 'stuff':2
2 | light is hotter than dark | 'dark':5 'hotter':3 'light':1
3 | nothing is that hottie | 'hotti':4 'noth':1
(3 rows)
postgres=# select * from test where tsvd @@ to_tsquery('english', 'hot:* |
hot');
id | data | tsvd
----+---------------------------+-------------------------------
1 | hot stuff is here | 'hot':1 'stuff':2
2 | light is hotter than dark | 'dark':5 'hotter':3 'light':1
3 | nothing is that hottie | 'hotti':4 'noth':1
(3 rows)
Looks good so far. Let's introduce an index:
postgres=# create index ix_test on test using gin(tsvd);
CREATE INDEX
postgres=# set enable_seqscan to off;
SET
First two queries result in the same row sets, but look at the third one:
postgres=# explain select * from test where tsvd @@
to_tsquery('english', 'hot:* | hot');
QUERY PLAN
----------------------------------------------------------------------
Bitmap Heap Scan on test (cost=4.26..8.28 rows=1 width=68)
Recheck Cond: (tsvd @@ to_tsquery('english', 'hot:* | hot'::text))
-> Bitmap Index Scan on ix_test (cost=0.00..4.26 rows=1 width=0)
Index Cond: (tsvd @@ to_tsquery('english', 'hot:* | hot'::text))
postgres=# select * from test where tsvd @@ to_tsquery('english', 'hot:* |
hot');
id | data | tsvd
----+-------------------+-------------------
1 | hot stuff is here | 'hot':1 'stuff':2
(1 row)
WTH? Apparently prefixed part of the query stopped working.
Interesting that the bug doesn't show up with GiST:
postgres=# drop index ix_test;
DROP INDEX
postgres=# create index ix_test on test using gist(tsvd);
CREATE INDEX
postgres=# select * from test where tsvd @@ to_tsquery('english', 'hot:* |
hot');
id | data | tsvd
----+---------------------------+-------------------------------
1 | hot stuff is here | 'hot':1 'stuff':2
2 | light is hotter than dark | 'dark':5 'hotter':3 'light':1
3 | nothing is that hottie | 'hotti':4 'noth':1
(3 rows)
From | Date | Subject | |
---|---|---|---|
Next Message | Merlin Moncure | 2010-01-15 18:25:56 | Re: Calling a plpgsql function with composite type as parameter? |
Previous Message | Aaron | 2010-01-15 18:03:59 | Re: Creation of tablespaces |