From: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
---|---|
To: | Ivan Voras <ivoras(at)freebsd(dot)org> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Trigram (pg_trgm) GIN index not used |
Date: | 2013-02-21 14:23:34 |
Message-ID: | CAHyXU0xZWHhHQjPjXKKsbS2PSX+O=+Wa-p-pw49varaOoWCo_A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Thu, Feb 21, 2013 at 6:06 AM, Ivan Voras <ivoras(at)freebsd(dot)org> wrote:
> On 21/02/2013 12:52, Ivan Voras wrote:
>
>> I'd like to use pg_trgm for matching substrings case-insensitively, but
>> it doesn't seem to use the index:
>
>
> As a sort-of followup, the '%' operator kind of works but takes
> incredibly long time, and the selectivity estimates are completely wrong:
>
> nn=> vacuum analyze documents;
> VACUUM
> nn=> explain select id,title from documents where raw_data % 'zagreb';
> QUERY PLAN
>
> -----------------------------------------------------------------------------------------
> Bitmap Heap Scan on documents (cost=128.42..330.87 rows=54 width=108)
> Recheck Cond: ((raw_data)::text % 'zagreb'::text)
> -> Bitmap Index Scan on documents_raw_data_trgm (cost=0.00..128.40
> rows=54 width=0)
> Index Cond: ((raw_data)::text % 'zagreb'::text)
> (4 rows)
>
> nn=> explain analyze select id,title from documents where raw_data %
> 'zagreb';
> QUERY
> PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------
> Bitmap Heap Scan on documents (cost=128.42..330.87 rows=54 width=108)
> (actual time=98750.283..98750.283 rows=0 loops=1)
> Recheck Cond: ((raw_data)::text % 'zagreb'::text)
> -> Bitmap Index Scan on documents_raw_data_trgm (cost=0.00..128.40
> rows=54 width=0) (actual time=26.748..26.748 rows=51874 loops=1)
> Index Cond: ((raw_data)::text % 'zagreb'::text)
> Total runtime: 98750.623 ms
> (5 rows)
>
>
> There is no IO load during this query.
pg_trgm is not really designed for indexing large documents, but for
fuzzy simple string (company name, address, etc) matching. probably
better off with full text search.
merlin
From | Date | Subject | |
---|---|---|---|
Next Message | Merlin Moncure | 2013-02-21 15:10:47 | Re: subselects vs WITH in views |
Previous Message | Ioana Danes | 2013-02-21 12:28:20 | Ioana Danes |