From: | Amit Langote <amitlangote09(at)gmail(dot)com> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Behavior of a pg_trgm index for 2 (or < 3) character LIKE queries |
Date: | 2013-05-30 02:51:27 |
Message-ID: | CA+HiwqH7F1xZAq7KeRGOzRYBruiMb++wMQfxxXPLMiAv_wLJkg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello,
I have been trying to understand how pg_trgm works. As part of that, I
was looking at gin_extract_query_trgm(), which I think, extracts
trigrams from a search query string. So, I debugged for 3 cases:
1) column_name LIKE '%緊急%'
in this case, inside gin_extract_query_trgm(), after a call to
generate_wildcard_trgm(), returned trglen is 0, hence
GIN_SEARCH_MODE_ALL search mode is used.
2) column_name LIKE '%os%'
same as in case (1)
3) column_name LIKE '%ost%'
returned trglen is > 0, things proceed differently. May be, trigrams
have been generated and cane be used for index search.
I later commented out #define KEEPONLYALNUM from
contrib/pg_trgm/trgm.h (following from a related discussion on
-hackers viz. http://www.postgresql.org/message-id/flat/CAHGQGwFJshvV2nGME19wdTW9teFw_w7h2ns4E+YYsjkB9WdWDQ(at)mail(dot)gmail(dot)com#CAHGQGwFJshvV2nGME19wdTW9teFw_w7h2ns4E+YYsjkB9WdWDQ@mail.gmail.com)
but things didn't change.
So, it appears, for search strings consisting of 2 (or < 3)
characters, trigrams can not be utilized. No?
NOTE: Using the master branch. The indexed column is a text field and
data consists of mix of Japanese, alphanumeric characters.
--
Amit Langote
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2013-05-30 03:01:31 | Re: all_visible replay aborting due to uninitialized pages |
Previous Message | Robert Haas | 2013-05-30 02:46:58 | Re: removing PD_ALL_VISIBLE |