Re: using Tsearch2 for chemical text

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "Rajarshi Guha" <rguha(at)indiana(dot)edu>, "pgsql-general" <pgsql-general(at)postgresql(dot)org>
Subject: Re: using Tsearch2 for chemical text
Date: 2007-07-25 22:52:56
Message-ID: D425483C2C5C9F49B5B7A41F8944154701000820@postal.corporate.connx.com
Lists: pgsql-general

Tsearch2 is meant for full-text indexing. It won't be any faster than a
btree index like the one you have now (I assume it's unique -- if it
isn't then I think it ought to be). Clustering the table on that index
should speed up your queries.
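
As a rough sketch of the clustering step, assuming the table is mytable
(the name used in the quoted query below) and that the existing btree
index on name is called mytable_name_idx (a hypothetical name, since the
original message does not give one):

```sql
-- mytable_name_idx is a hypothetical name; substitute the real index
-- on mytable(name).

-- PostgreSQL 8.2 and earlier:
CLUSTER mytable_name_idx ON mytable;

-- PostgreSQL 8.3 and later:
-- CLUSTER mytable USING mytable_name_idx;

-- Refresh planner statistics after the physical reorder.
ANALYZE mytable;
```

CLUSTER takes an exclusive lock while it rewrites the table, and the
ordering is not maintained for rows inserted later, so it may need to be
re-run periodically.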

> -----Original Message-----
> From: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of Rajarshi Guha
> Sent: Wednesday, July 25, 2007 3:41 PM
> To: pgsql-general
> Subject: [GENERAL] using Tsearch2 for chemical text
>
> Hi, I have a table with about 9M entries. The table has 2 fields: id
> and name, which are of serial and text types respectively. I have an
> ordinary index on the text field which allows me to do searches in
> reasonable time. Most of my searches are of the form
>
> select * from mytable where name ~ 'some text query'
>
> I know that the Tsearch2 module will let me have very efficient text
> searches. But if I understand correctly, it's based on a language
> specific dictionary.
>
> My problem is that the name column contains names of chemicals. Now
> for many cases this may simply be a number (1674-56-2) and in other
> cases it may be an alphanumeric string (such as (-)O-acetylcarnitine
> or 1,2-cis-dihydroxybenzoate). In some cases it is a well-known word
> (say viagra or calcium chloride or pentathol).
>
> My question is: will Tsearch2 be able to handle this type of text? Or
> will it be hampered by the fact that the bulk of the rows do not
> correspond to ordinary English?
>
> -------------------------------------------------------------------
> Rajarshi Guha <rguha(at)indiana(dot)edu>
> GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
> -------------------------------------------------------------------
> "My Ethicator machine must have had a built-in moral
> compromise spectral phantasmatron! I'm a genius."
> -Calvin
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do not
> match
