Re: String searching

From: Andy Colson <andy(at)squeakycode(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: String searching
Date: 2014-11-18 14:37:09
Message-ID: 546B5995.1030907@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 11/17/2014 7:54 PM, Jonathan Vanasco wrote:
>
> On Nov 17, 2014, at 12:55 PM, Robert DiFalco wrote:
>
>> SELECT * FROM MyTable WHERE upper(FullName) LIKE upper('%John%');
>>
>> That said, which would be the best extension module to use? A "gist" index on the uppercased column? Or something else? Thanks!
>
> Performance wise, I think a function index would probably be the best:
>
> CREATE INDEX mytable_lower_fullname_idx ON mytable(lower(fullname));
>
> SELECT * FROM mytable WHERE lower(fullname) LIKE lower('%john%');
>
> The only reason why I use `lower` and not `upper` is that it's easier to look at when dealing with debugging and sample queries.
>
> I'd bench against GIN and GIST, but I think this will work the best.
>
> The reason is that GIN/GIST use language patterns to simplify the index. so they work great on "words"
>
> select plainto_tsquery('doing watching reading programming');
> 'watch' & 'read' & 'program'
>
> but not so great on "names":
>
> select plainto_tsquery('john doe');
> 'john' & 'doe'
>
> select plainto_tsquery('jon doe');
> 'jon' & 'doe
>
> So you'll get a bit more overhead on the match and you won't get a smaller index (which is why they're great for fulltext)
>
> The search execution might turn out to be much faster. If so, i'd love to know. But doing a lower() search on a lower() function index has always been ridiculously fast for me.
>
> This only goes for names though. If you're searching other fields, then another search method might be considerably better.
>

Full Text Search has another awesome benefit. Aliases. Bob == Robert.
I do address searches, and I've created a custom dictionary that says
st == street, n == north, etc.

So when a person searches for 1st ne, they find all combinations of
1 street north east.

Its indexes, so its fast. Ram and disk are cheap, who cares how big it is.

-Andy

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Jonathan Vanasco 2014-11-18 16:01:45 Re: String searching
Previous Message Adrian Klaver 2014-11-18 14:35:14 Re: About the tps explanation of pgbench, please help