Re: Question regarding specifics of GIN and pg_trgm performance and potential use of show_trgm to improve it

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Pavel Horal <pavel(dot)horal(at)orchitech(dot)cz>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Question regarding specifics of GIN and pg_trgm performance and potential use of show_trgm to improve it
Date: 2023-09-24 23:53:31
Message-ID: CAMkU=1x2VvNSH_wHprQhffjq9CQJmuADBCsXhn7d05v_jUXMzg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, May 24, 2023 at 4:35 PM Pavel Horal <pavel(dot)horal(at)orchitech(dot)cz>
wrote:

I didn't see your email when first sent, and stumbled upon it while
searching for something else. But it still might be worthwhile commenting
even after all of this time.

>
>
*Is my understanding correct that this happens only because pg_trgm is not
> able to actually determine if the matched item from the index search is
> actually much much longer than the query?* Is there any way how the
> performance can be improved in this case? I thought that I can store number
> of trigrams in the index, but that is not being used by the query planner:
>
> CREATE INDEX test_idx2 ON test USING GIN (value gin_trgm_ops,
> array_length(show_trgm(value), 1));
>
> EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM test WHERE value % 'lorem' AND
> array_length(show_trgm(value), 1) < array_length(show_trgm('lorem'), 1) /
> 0.5;
>

The main problem here is of expression type. You have an index using an
expression returning an int, while you are comparing it to an expression
returning a numeric. That inhibits the use of the index over that
expression.

Just casting the type when creating the index is enough (given your test
case) to get this to do what you want:

CREATE INDEX test_idx2 ON test USING GIN (value gin_trgm_ops,
(array_length(show_trgm(value), 1)::numeric));

However, it would probably be more efficient to partition the table on the
trigram count, rather than adding that count to the index. Then it could
just skip any partition with too many trigrams.

Cheers,

Jeff

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Raivo Rebane 2023-09-25 08:36:27 Right version of jdbc
Previous Message Laurenz Albe 2023-09-24 18:18:58 Re: jdbc problem