Re: [pg_trgm] Making similarity(?, ?) < ? use an index

From: Greg Navis <contact(at)gregnavis(dot)com>
To: Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: [pg_trgm] Making similarity(?, ?) < ? use an index
Date: 2016-06-03 11:24:06
Message-ID: CAA6WWt8hQgKcmtHiQ2XfMMyBP-Em_eUR05eTaH84WgOCx6UVJA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Artur, thanks for your reply. That's right, `%` does use the index. The
goal of using `similarity(lhs, rhs) >= show_limit()` was to replace
`show_limit()` with a custom, per-query limit. I noticed that the latter
approach does _not_ use the index, hence my question:

grn=# EXPLAIN ANALYZE SELECT * FROM restaurants WHERE city % 'warsw';
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on restaurants (cost=24.28..1319.36 rows=515 width=10)
(actual time=96.081..96.456 rows=400 loops=1)
Recheck Cond: ((city)::text % 'warsw'::text)
Heap Blocks: exact=359
-> Bitmap Index Scan on restaurants_city_gist_trgm_idx
(cost=0.00..24.15 rows=515 width=0) (actual time=96.030..96.030 rows=400
loops=1)
Index Cond: ((city)::text % 'warsw'::text)
Planning time: 0.211 ms
Execution time: 96.528 ms
(7 rows)

grn=# EXPLAIN ANALYZE SELECT * FROM restaurants WHERE similarity(city,
'warsw') >= show_limit();
QUERY PLAN

--------------------------------------------------------------------------------------------------------------------
Seq Scan on restaurants (cost=0.00..11692.81 rows=171825 width=10)
(actual time=14.520..692.520 rows=400 loops=1)
Filter: (similarity((city)::text, 'warsw'::text) >= show_limit())
Rows Removed by Filter: 515075
Planning time: 0.109 ms
Execution time: 692.560 ms
(5 rows)

If this functionality isn't supported then it might be a good idea for a
contribution.

Best regards

On Fri, Jun 3, 2016 at 12:51 PM, Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
wrote:

> Hello.
>
> As I know 'lhs % rhs' is equivalent to 'similarity(lhs, rhs) >=
> show_limit()'.
>
> And so your query should looks like this:
>
> SELECT * FROM restaurants WHERE city % 'warsw';
>
> And it should use index.
>
>
> On 03.06.2016 13:35, Greg Navis wrote:
>
>> Hey!
>>
>> I'm playing with pg_trgm. It seems that `lhs % rhs` is _almost_
>> equivalent to `similarity(lhs, rhs) < show_limit()`. The difference that
>> I noticed is that `%` uses a GIN index while `similarity` does not.
>>
>> ```
>> grn=# \d restaurants
>> Table "public.restaurants"
>> Column | Type | Modifiers
>> --------+------------------------+-----------
>> city | character varying(255) | not null
>> Indexes:
>> "restaurants_city_trgm_idx" gin (city gin_trgm_ops)
>>
>> grn=# SELECT COUNT(*) FROM restaurants;
>> count
>> --------
>> 515475
>> (1 row)
>>
>> Time: 45.964 ms
>> grn=# EXPLAIN ANALYZE SELECT * FROM restaurants WHERE similarity(city,
>> 'warsw') > show_limit();
>> QUERY PLAN
>>
>>
>> --------------------------------------------------------------------------------------------------------------------
>> Seq Scan on restaurants (cost=0.00..11692.81 rows=171825 width=10)
>> (actual time=16.436..665.062 rows=360 loops=1)
>> Filter: (similarity((city)::text, 'warsw'::text) > show_limit())
>> Rows Removed by Filter: 515115
>> Planning time: 0.139 ms
>> Execution time: 665.105 ms
>> (5 rows)
>>
>> Time: 665.758 ms
>> ```
>>
>> My question is: is it possible to make `similarity` use the index? If
>> not, is there a way to speed up the query above?
>>
>> Best regards
>> --
>> Greg Navis
>> I help tech companies to scale Heroku-hosted Rails apps.
>> Free, biweekly scalability newsletter for SaaS CEOs
>> <http://www.gregnavis.com/newsletter/>
>>
>>
>
> --
> Artur Zakirov
> Postgres Professional: http://www.postgrespro.com
> Russian Postgres Company
>

--
Greg Navis
I help tech companies to scale Heroku-hosted Rails apps.
Free, biweekly scalability newsletter for SaaS CEOs
<http://www.gregnavis.com/newsletter/>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Artur Zakirov 2016-06-03 11:58:24 Re: [pg_trgm] Making similarity(?, ?) < ? use an index
Previous Message Artur Zakirov 2016-06-03 10:57:04 Re: [pg_trgm] Per-query set_limit()