Re: Metaphone function attachment

From: mlw <markw(at)mohawksoft(dot)com>
To: Joel Burton <jburton(at)scw(dot)org>
Cc: pgsql-general(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Metaphone function attachment
Date: 2001-05-04 12:02:52
Message-ID: 3AF29A6C.A7E75B66@mohawksoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Joel Burton wrote:
>
> --
> Joel Burton <jburton(at)scw(dot)org>
> Director of Information Systems, Support Center of Washington
>
> -------------------------------------------------------------------------------
> Name: contrib-metaphone.tgz
> contrib-metaphone.tgz Type: unspecified type (APPLICATION/octet-stream)
> Encoding: BASE64
>
> -------------------------------------------------------------------------------
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster

I have written a similar function, and it is in a library I wrote "pgcontains"
which started out as a simple implementation of "contains(...)" but has grown
WAY beyond it's original scope.

Anyway, Metaphone is great for doing some cool searches, but now that we are on
the subject of cool search functions, I have a few that may be useful and would
be glad to contribute. My only question is how? (and I'd have to write them for
7.1, because they are still in 7.0.x format)

contains(...)
A simple implementation of contains. Forces a table scan, but does some cool
things including phrase detection.

decode(...)
Similar to "case" but works as decode for oracle queries. In 7.1 and higher it
should be easy to make one function take a variable number of parameters. Right
now I have a stub for the most common numbers.

strip(...)
Strips out all but alphanumeric characters and returns a lowercase string.
"Oops I did it again" comes back as "oopsididitagain." This is cool for lightly
fuzzy searches.

striprev(...)
Like strip, but reverses the string. Allows you to use an index for records
which end in something. For instance: "select * from table where field like
'abc%'" can use an index, where as "select * from table where field like
'%xyx'" will not. However, "select * from table where striprev(field) like
striprev('xyz') || '%'" can.

Example:
cdinfo=# select title, striprev(title) from ztitles where striprev(title) like
striprev('wall') || '%' limit 3;
title | striprev
--------------------------------------------+------------------------------------
A Giggle Can Wiggle Its Way Through A Wall |
llawahguorhtyawstielggiwnacelggiga
Shadows On A Wall * | llawanoswodahs
The Wall | llaweht
(3 rows)

cdinfo=# explain select title, striprev(title) from ztitles where
striprev(title) like striprev('wall') || '%' limit
3;
NOTICE: QUERY PLAN:

Limit (cost=0.00..10.21 rows=3 width=12)
-> Index Scan using f1 on ztitles (cost=0.00..7579.94 rows=2227 width=12)

EXPLAIN

int64_t *strtonumu(text, int4 base)
Converts a string to a number with an arbitrary base. (Is there a function to
do this already?)

--
I'm not offering myself as an example; every life evolves by its own laws.
------------------------
http://www.mohawksoft.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Dariusz Pietrzak 2001-05-04 12:05:19 Permissions and views.
Previous Message Peter Haworth 2001-05-04 11:49:57 Re: DBD::Pg errstr method doesn't return full error messages

Browse pgsql-hackers by date

  From Date Subject
Next Message Dariusz Pietrzak 2001-05-04 12:05:19 Permissions and views.
Previous Message mlw 2001-05-04 12:02:17 Re: New Linux xfs/reiser file systems