| From: | Andy Colson <andy(at)squeakycode(dot)net> | 
|---|---|
| To: | pgsql-general(at)postgresql(dot)org | 
| Subject: | Re: String searching | 
| Date: | 2014-11-18 14:37:09 | 
| Message-ID: | 546B5995.1030907@squeakycode.net | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-general | 
On 11/17/2014 7:54 PM, Jonathan Vanasco wrote:
>
> On Nov 17, 2014, at 12:55 PM, Robert DiFalco wrote:
>
>>      SELECT * FROM MyTable WHERE upper(FullName) LIKE upper('%John%');
>>
>> That said, which would be the best extension module to use? A "gist" index on the uppercased column? Or something else? Thanks!
>
> Performance wise, I think a function index would probably be the best:
>
> 	CREATE INDEX mytable_lower_fullname_idx ON mytable(lower(fullname));
>
> 	SELECT * FROM mytable WHERE lower(fullname) LIKE lower('%john%');
>
> The only reason why I use `lower` and not `upper` is that it's easier to look at when dealing with debugging and sample queries.
>
> I'd bench against GIN and GIST, but I think this will work the best.
>
> The reason is that GIN/GIST use language patterns to simplify the index.  so they work great on "words"
>
> 	select plainto_tsquery('doing watching reading programming');
> 	'watch' & 'read' & 'program'
>
> but not so great on "names":
>
> 	select plainto_tsquery('john doe');
> 	 'john' & 'doe'
>
> 	select plainto_tsquery('jon doe');
> 	 'jon' & 'doe
>
> So you'll get a bit more overhead on the match and you won't get a smaller index (which is why they're great for fulltext)
>
> The search execution might turn out to be much faster.  If so, i'd love to know.  But doing a lower() search on a lower() function index has always been ridiculously fast for me.
>
> This only goes for names though.  If you're searching other fields, then another search method might be considerably better.
>
Full Text Search has another awesome benefit.  Aliases.  Bob == Robert. 
  I do address searches, and I've created a custom dictionary that says 
st == street, n == north, etc.
So when a person searches for 1st ne, they find all combinations of
1 street north east.
Its indexes, so its fast. Ram and disk are cheap, who cares how big it is.
-Andy
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Jonathan Vanasco | 2014-11-18 16:01:45 | Re: String searching | 
| Previous Message | Adrian Klaver | 2014-11-18 14:35:14 | Re: About the tps explanation of pgbench, please help |