bug of pg_trgm?

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: bug of pg_trgm?
Date: 2012-08-07 18:38:31
Message-ID: CAHGQGwGxgUiqyid2yjfZwZmMzn9BZmVjmHua2ZqPvU2KR=5SqA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

When I used pg_trgm, I encountered the problem that the search result of
SeqScan was the different from that of BitmapScan even if the search
keyword was the same. Is this a bug? Here is the test case:

---------------------------
CREATE EXTENSION pg_trgm;
CREATE TABLE tbl (col text);
CREATE INDEX idx ON tbl USING gin (col gin_trgm_ops);
INSERT INTO tbl VALUES ('abc'), ('ab c');

SET enable_seqscan TO off;
SET enable_bitmapscan TO on;
SELECT * FROM tbl WHERE col LIKE E'%\\c%';
col
------
ab c
(1 row)

SET enable_seqscan TO on;
SET enable_bitmapscan TO off;
SELECT * FROM tbl WHERE col LIKE E'%\\c%';
col
------
abc
ab c
(2 rows)
---------------------------

The cause is ISTM that pg_trgm wrongly ignores the heading wildcard
character (i.e., %) when an escape (i.e., \\) follows the wildcard character.
Attached patch fixes this.

The patch fixes another problem: pg_trgm wrongly ignores the backslash \\
following the escape, i.e., \\\\. This problem might be harmless when
KEEPONLYALNUM is enabled because any characters other than
alphabets and digits are ignored. But, when KEEPONLYALNUM is disabled,
\\\\ should be interpreted as a backslash character itself, but
pg_trgm does not.

Regards,

--
Fujii Masao

Attachment Content-Type Size
trgm_bugfix_v1.patch application/octet-stream 678 bytes

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2012-08-07 19:01:12 Re: WIP: pg_pretty_query
Previous Message Tom Lane 2012-08-07 18:14:26 Re: WIP: pg_pretty_query