Quick Links

How to Optimize pg_trgm Performance

From:	"Igal (at) Lucee(dot)org" <igal(at)lucee(dot)org>
To:	pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject:	How to Optimize pg_trgm Performance
Date:	2018-01-28 23:42:17
Message-ID:	0676f27e-118b-892b-f44d-c60ab13eb37c@lucee.org
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

I want to use pg_trgm for auto-suggest functionality. I created a
Materialized View with the information that I need, with the relevant
columns being (keywords text, rank int). keywords is the column from
which I build the tri-grams, and rank is some popularity factor so that
popular results will show up higher than less popular results given the
same tri-gram distance.

I want to return results in the order of [distance], [distance_word],
[rank]. The input comes from the user and is not known in advance. My
query is as follows:

    SELECT title
        ,id
        ,(input <-> keywords) AS distance
        ,(input <<-> keywords) AS distance_word
        ,rank
    FROM (VALUES (cast('red pill' AS text))) consts(input)
        ,mv_autosuggest
    ORDER BY 3, 4, 5
    LIMIT 20;

This gives me pretty good results, but it takes too long and is not
likely to scale well.

I have created two indexes but neither seem to be used:

CREATE INDEX mv_autosuggest_keywords_tgrm_gist ON staging.mv_autosuggest
USING gist (keywords gist_trgm_ops);

CREATE INDEX mv_autosuggest_keywords_tgrm_gin ON staging.mv_autosuggest
USING gin (keywords gin_trgm_ops);

This is the result of explain analyze:

QUERY PLAN |
-------------------------------------------------------------------------------------------------------------------------------------------|
Limit (cost=356.41..356.46 rows=20 width=51) (actual
time=163.132..163.135 rows=20
loops=1)                                               |
-> Sort (cost=356.41..372.96 rows=6619 width=51) (actual
time=163.130..163.131 rows=20
loops=1)                                        |
        Sort Key: (('red pill'::text <-> mv_autosuggest.keywords)),
(('red pill'::text <<-> mv_autosuggest.keywords)), mv_autosuggest.rank |
        Sort Method: top-N heapsort Memory: 28kB |
        -> Seq Scan on mv_autosuggest (cost=0.00..180.29 rows=6619
width=51) (actual time=0.263..161.289 rows=6619 loops=1)              |
Planning time: 0.139 ms |
Execution time: 163.174 ms |

How can I improve the performance here?

Thank you,

Igal Sapir
Lucee Core Developer
Lucee.org <http://lucee.org/>

Responses

Re: How to Optimize pg_trgm Performance at 2018-01-30 22:50:46 from Ivan E. Panchenko

Browse pgsql-general by date

	From	Date	Subject
Next Message	David Fetter	2018-01-29 02:57:48	== PostgreSQL Weekly News - January 28 2018 ==
Previous Message	Martin Goodson	2018-01-28 23:11:58	Re: FW: Setting up streaming replication problems