| From: | Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au> |
|---|---|
| To: | "W(dot)H(dot) van Atteveldt" <wouter(at)2at(dot)nl> |
| Cc: | pgsql-performance(at)postgresql(dot)org |
| Subject: | Re: Postgres query optimization with varchar fields |
| Date: | 2004-06-03 01:42:46 |
| Message-ID: | 40BE8216.5000802@familyhealth.com.au |
| Lists: | pgsql-performance |
> I am investigating whether it is useful to directly query a database
> containing a rather large text corpus (order of magnitude 100k - 1m
> newspaper articles, so around 100 million words), or whether I should
> use third party text indexing services. I want to know things such as:
> how often is a certain word (or pattern) mentioned in an article and how
> often it is mentioned with the condition that another word is nearby
> (same article or n words distant).
You really want to use the contrib/tsearch2 module that already ships
with PostgreSQL:
cd contrib/tsearch2
gmake install
psql <mydb> < tsearch2.sql
more README.tsearch2
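Once the module is loaded, usage looks roughly like the sketch below. It is only an illustration: the table articles(id, body), the column name idxfti and the index name are made up, and the 'default' configuration name is assumed to be the one tsearch2.sql installs. It counts articles that match, not occurrences inside an article, and it only covers the "same article" case; it does not handle "within n words".

```sql
-- Hypothetical table: articles(id integer, body text). All names are illustrative.

-- Add a tsvector column and fill it from the article text.
ALTER TABLE articles ADD COLUMN idxfti tsvector;
UPDATE articles SET idxfti = to_tsvector('default', body);

-- Index the tsvector column so that @@ searches can use the index.
CREATE INDEX articles_idxfti_idx ON articles USING gist (idxfti);
ANALYZE articles;

-- Number of articles that mention one word.
SELECT count(*)
  FROM articles
 WHERE idxfti @@ to_tsquery('default', 'election');

-- Number of articles that mention both words (same-article co-occurrence).
SELECT count(*)
  FROM articles
 WHERE idxfti @@ to_tsquery('default', 'election & minister');
```

The README covers keeping the tsvector column up to date when rows change.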
Chris