Quick Links

Re: Using In Clause For a Large Text Matching Query

From:	Richard Broersma Jr <rabroersma(at)yahoo(dot)com>
To:	Jason Farmer <jfarmer(at)getloaded(dot)com>, pgsql-sql(at)postgresql(dot)org
Subject:	Re: Using In Clause For a Large Text Matching Query
Date:	2006-06-29 20:49:27
Message-ID:	20060629204927.16465.qmail@web31802.mail.mud.yahoo.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-sql

> We can assume a certain portion of the text is included in the DB table,
> so I want to be able to do a substring match on "brown" and "green" and
> in this case return both "brown kitty", and "green doggy". However the
> problem is, if I run the query on each of my 300 rows to scan 200,000
> rows in my DB is entirely too slow. So I was hoping to use the IN clause
> to create an IN group of about 300 items to scan the DB once.

You can probably do it. However, you will have to pick a substring from your text field to
compare against. In this case you seem to be choosing the first word, i.e. "brown" and "green".

so maybe:

select t1.col1
from
table1 as t1,
(
select distinct split_part( tblFileDump.Col1, ' ', 1 ) as samplecol
from tblFileDump
) as fd1

where t1.col1 like '%' || fd1.samplecol || '%'
;

This is just an idea. I've never used split_part or developed a sudo join this way. But it may
work provided you and jump your text files into a temp table.

Notice:
http://www.postgresql.org/docs/8.1/interactive/functions-string.html
for the syntax for split_part().

Regards,

Richard Broersma Jr.

In response to

Using In Clause For a Large Text Matching Query at 2006-06-29 20:00:50 from Jason Farmer

Browse pgsql-sql by date

	From	Date	Subject
Next Message	Jonah H. Harris	2006-06-29 21:03:54	Re: Documentation Generator for pl/pgsql
Previous Message	Daniel Caune	2006-06-29 20:07:04	Documentation Generator for pl/pgsql