Using In Clause For a Large Text Matching Query

From: Jason Farmer <jfarmer(at)getloaded(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: Using In Clause For a Large Text Matching Query
Date: 2006-06-29 20:00:50
Message-ID: 44A43172.3040202@getloaded.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Hello all, my first post to the pgsql mailing list!

There may be a completely better way to do this; if so please help point
me in the right direction!

What I'm trying to do is run a query to partially match 300+ text fields
to a PG table of about 200,000 rows. The idea is to pull out a portion
of text from the original text fields and do a partial text match on
anything in my PG table containing that substring.

I liked the idea of using a where IN(group) to do my comparisons, as in

select col1 from table1 where col1 in ('text1','text2')

however, this requires an exact string match. Is there any way to do a
substring match inside of my IN group? Or can anyone think of a better
way to do something like this?

Heres an example of something of how I'd like this to work:

Portion of 300 Original Text fields:
"brown cat"
"green dog"

2 rows of 200k+ Database table:
"brown kitty"
"green doggy"

We can assume a certain portion of the text is included in the DB table,
so I want to be able to do a substring match on "brown" and "green" and
in this case return both "brown kitty", and "green doggy". However the
problem is, if I run the query on each of my 300 rows to scan 200,000
rows in my DB is entirely too slow. So I was hoping to use the IN clause
to create an IN group of about 300 items to scan the DB once.

I hope this makes sense, but if anything sounds confusing please let me
know, and I will be sure to clarify! Thanks for any help or direction
anyone can provide!!

- Jason Farmer

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Daniel Caune 2006-06-29 20:07:04 Documentation Generator for pl/pgsql
Previous Message operationsengineer1 2006-06-29 18:02:44 Re: Data Entry and Query forms