Re: [pg_trgm] Making similarity(?, ?) < ? use an index

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Greg Navis <contact(at)gregnavis(dot)com>
Cc: Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: [pg_trgm] Making similarity(?, ?) < ? use an index
Date: 2016-06-10 16:47:29
Message-ID: CAMkU=1xhNUaS-J455md882-hzmT6x-Uca3=SYymVmVUL4iPM2A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, Jun 10, 2016 at 9:20 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> On Thu, Jun 9, 2016 at 1:57 AM, Greg Navis <contact(at)gregnavis(dot)com> wrote:
>> Artur, no worries, I'm not writing any code ;-)
>>
>> I did the following:
>>
>> CREATE TYPE trgm_match AS (match TEXT, threshold NUMERIC);
>
> I would probably use REAL, not NUMERIC. But maybe there is good
> reason to use NUMERIC.
>
>> CREATE OR REPLACE FUNCTION trgm_check_match (string TEXT, match trgm_match)
>> RETURNS bool
>> AS 'SELECT match.match <-> string <= 1 - match.threshold'
>> LANGUAGE SQL;

You will have to somehow prevent this from getting inlined. If it is
inlined, then it will no longer be
recognized as being an indexable operator. So maybe use plpgsql as
the language.

>> CREATE OPERATOR %(leftarg = text, rightarg = trgm_match,
>> procedure=trgm_check_match);
>>
>> This allows me to write:
>>
>> SELECT ('Warsaw' % row('Warsw', 0.3)::trgm_match);
>>
>> I'm not sure how to make this operator use an index. It seems I need to
>> create an operator class but I'm not sure how. This is how pg_trgm creates
>> its operator class:
>
> I think you should pick a new operator name, not try to reuse %.
> Based on Tom's previous comment that forking is probably not a good
> idea, you probably want the new operator to co-exist with the existing
> one, so it needs a different name. For example, I picked %% without
> giving it a lot of thought for this example below.

On second thought, it could use overloading distinguished with
different argument types, so it doesn't need a different name, but I
don't know if it is a good idea to use that overloading.

Cheers,

Jeff

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Francisco Olarte 2016-06-10 16:49:41 Re: [HACKERS] Online DW
Previous Message David G. Johnston 2016-06-10 16:46:29 Re: What is the general opinion on use of tablespaces