From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | pg_trgm partial-match |
Date: | 2012-11-15 19:39:21 |
Message-ID: | CAHGQGwFJshvV2nGME19wdTW9teFw_w7h2ns4E+YYsjkB9WdWDQ@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
I'd like to propose extending pg_trgm so that it can match a partial query
key against a GIN index. In other words, I'm thinking of implementing the
'comparePartial' GIN support method for pg_trgm.
Currently, when the query key is shorter than three characters, we cannot use
a GIN index (+ pg_trgm) efficiently, because pg_trgm doesn't support a
partial-match method. In this case, a seq scan or a full index scan is
executed, and the response time is very slow. I'd like to alleviate this
problem.
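To illustrate the idea, here is a minimal standalone sketch of the kind of
prefix test a comparePartial-style function performs. It is not taken from the
attached patch: the function name `prefix_compare_partial` and the use of plain
byte strings as keys are assumptions made for illustration. Only the
return-value convention (0 for a match, negative to skip the key and keep
scanning, positive to stop the scan) follows the documented GIN comparePartial
contract.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a GIN comparePartial support function,
 * operating on raw byte strings instead of Datums.
 *
 *    0  -> key starts with the partial query: report a match
 *   <0  -> key does not match, but later keys might: keep scanning
 *   >0  -> key sorts after every possible match: stop the scan
 */
static int
prefix_compare_partial(const char *partial, size_t partial_len,
                       const char *key, size_t key_len)
{
    size_t n = partial_len < key_len ? partial_len : key_len;
    int    cmp = memcmp(key, partial, n);

    if (cmp != 0)
        return cmp;     /* key sorts before (<0) or after (>0) the prefix range */
    if (key_len < partial_len)
        return -1;      /* key is shorter than the query, so it sorts before it */
    return 0;           /* key begins with the partial query */
}
```

With keys stored in byte order, a scan that starts at the first key not less
than the partial query can call this on each key in turn and stop as soon as
it returns a positive value.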
Note that we cannot do a partial match if KEEPONLYALNUM is disabled,
i.e., if the query key contains multibyte characters. In this case, the byte
length of the trigram string might be larger than three, and its CRC is used
as the trigram key instead of the trigram string itself. Because the CRC is
used, we cannot do a partial match. The attached patch extends pg_trgm so
that it compares a partial-match query key only when KEEPONLYALNUM is
enabled.
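The following standalone sketch shows why a hashed key defeats prefix
matching. It is an illustration under stated assumptions, not pg_trgm's actual
code: `make_key` is a hypothetical helper, and `crc32_bytes` is the common
reflected CRC-32, used here only as a stand-in for whatever CRC pg_trgm
applies. A three-byte trigram is kept verbatim, so its leading bytes are still
the leading bytes of the key. A longer (multibyte) trigram is replaced by
bytes of its checksum, which bear no relation to the original prefix.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Common reflected CRC-32 (polynomial 0xEDB88320); a stand-in, not pg's CRC. */
static uint32_t
crc32_bytes(const unsigned char *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++)
    {
        crc ^= p[i];
        for (int b = 0; b < 8; b++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/*
 * Hypothetical key builder: a 3-byte trigram is stored as-is; anything
 * longer is replaced by three bytes taken from its CRC.
 */
static void
make_key(const char *trgm, unsigned char out[3], int *hashed)
{
    size_t len = strlen(trgm);

    if (len == 3)
    {
        memcpy(out, trgm, 3);       /* prefix of the text is prefix of the key */
        *hashed = 0;
    }
    else
    {
        uint32_t crc = crc32_bytes((const unsigned char *) trgm, len);

        out[0] = (unsigned char) (crc >> 16);
        out[1] = (unsigned char) (crc >> 8);
        out[2] = (unsigned char) crc;
        *hashed = 1;                /* key bytes no longer reflect the text */
    }
}
```

Once `hashed` is set, comparing a one- or two-character query against the
first bytes of the key is meaningless, which is why the partial-match path has
to be limited to keys that are stored verbatim.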
The attached patch is still WIP. What I should do next is:
* bump the pg_trgm version from 1.0 to 1.1, i.e., create pg_trgm--1.1.sql, etc.
* write the regression test
Comments? Review? Objection?
Regards,
--
Fujii Masao
Attachment | Content-Type | Size |
---|---|---|
trgm_compare_partial_v0.patch | application/octet-stream | 5.4 KB |