Re: [pg_trgm] Making similarity(?, ?) < ? use an index

From: Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
To: Greg Navis <contact(at)gregnavis(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: [pg_trgm] Making similarity(?, ?) < ? use an index
Date: 2016-06-03 11:58:24
Message-ID: c1b61e4b-0a1c-41ac-43a2-ead19f641606@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Oh, I understand. It is because you want different limits for
restaurants and cinemas?

I see only one solution. It is custom extension, which will create
operator class similar to gin_trgm_ops and will depends on pg_trgm. In
gin_trgm_consistent() you can use your own limit variable.

As I know functions do not use indexes.

Of course I may be wrong. And somebody knows a better solution.

On 03.06.2016 14:24, Greg Navis wrote:
> Artur, thanks for your reply. That's right, `%` does use the index. The
> goal of using `similarity(lhs, rhs) >= show_limit()` was to replace
> `show_limit()` with a custom, per-query limit. I noticed that the latter
> approach does _not_ use the index, hence my question:
>
> grn=# EXPLAIN ANALYZE SELECT * FROM restaurants WHERE city % 'warsw';
> QUERY
> PLAN
> ----------------------------------------------------------------------------------------------------------------------------------------------
> Bitmap Heap Scan on restaurants (cost=24.28..1319.36 rows=515
> width=10) (actual time=96.081..96.456 rows=400 loops=1)
> Recheck Cond: ((city)::text % 'warsw'::text)
> Heap Blocks: exact=359
> -> Bitmap Index Scan on restaurants_city_gist_trgm_idx
> (cost=0.00..24.15 rows=515 width=0) (actual time=96.030..96.030
> rows=400 loops=1)
> Index Cond: ((city)::text % 'warsw'::text)
> Planning time: 0.211 ms
> Execution time: 96.528 ms
> (7 rows)
>
> grn=# EXPLAIN ANALYZE SELECT * FROM restaurants WHERE similarity(city,
> 'warsw') >= show_limit();
> QUERY PLAN
>
> --------------------------------------------------------------------------------------------------------------------
> Seq Scan on restaurants (cost=0.00..11692.81 rows=171825 width=10)
> (actual time=14.520..692.520 rows=400 loops=1)
> Filter: (similarity((city)::text, 'warsw'::text) >= show_limit())
> Rows Removed by Filter: 515075
> Planning time: 0.109 ms
> Execution time: 692.560 ms
> (5 rows)
>
> If this functionality isn't supported then it might be a good idea for a
> contribution.
>
> Best regards
>
> On Fri, Jun 3, 2016 at 12:51 PM, Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru
> <mailto:a(dot)zakirov(at)postgrespro(dot)ru>> wrote:
>
> Hello.
>
> As I know 'lhs % rhs' is equivalent to 'similarity(lhs, rhs) >=
> show_limit()'.
>
> And so your query should looks like this:
>
> SELECT * FROM restaurants WHERE city % 'warsw';
>
> And it should use index.
>
>
> On 03.06.2016 13:35, Greg Navis wrote:
>
> Hey!
>
> I'm playing with pg_trgm. It seems that `lhs % rhs` is _almost_
> equivalent to `similarity(lhs, rhs) < show_limit()`. The
> difference that
> I noticed is that `%` uses a GIN index while `similarity` does not.
>
> ```
> grn=# \d restaurants
> Table "public.restaurants"
> Column | Type | Modifiers
> --------+------------------------+-----------
> city | character varying(255) | not null
> Indexes:
> "restaurants_city_trgm_idx" gin (city gin_trgm_ops)
>
> grn=# SELECT COUNT(*) FROM restaurants;
> count
> --------
> 515475
> (1 row)
>
> Time: 45.964 ms
> grn=# EXPLAIN ANALYZE SELECT * FROM restaurants WHERE
> similarity(city,
> 'warsw') > show_limit();
> QUERY PLAN
>
> --------------------------------------------------------------------------------------------------------------------
> Seq Scan on restaurants (cost=0.00..11692.81 rows=171825 width=10)
> (actual time=16.436..665.062 rows=360 loops=1)
> Filter: (similarity((city)::text, 'warsw'::text) > show_limit())
> Rows Removed by Filter: 515115
> Planning time: 0.139 ms
> Execution time: 665.105 ms
> (5 rows)
>
> Time: 665.758 ms
> ```
>
> My question is: is it possible to make `similarity` use the
> index? If
> not, is there a way to speed up the query above?
>
> Best regards
> --
> Greg Navis
> I help tech companies to scale Heroku-hosted Rails apps.
> Free, biweekly scalability newsletter for SaaS CEOs
> <http://www.gregnavis.com/newsletter/>
>
>
>
> --
> Artur Zakirov
> Postgres Professional: http://www.postgrespro.com
> Russian Postgres Company
>
>
>
>
> --
> Greg Navis
> I help tech companies to scale Heroku-hosted Rails apps.
> Free, biweekly scalability newsletter for SaaS CEOs
> <http://www.gregnavis.com/newsletter/>
>

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

In response to

Browse pgsql-general by date

  From Date Subject
Next Message David G. Johnston 2016-06-03 12:05:39 Re: [pg_trgm] Making similarity(?, ?) < ? use an index
Previous Message Greg Navis 2016-06-03 11:24:06 Re: [pg_trgm] Making similarity(?, ?) < ? use an index