From: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
---|---|
To: | Rajarshi Guha <rguha(at)indiana(dot)edu> |
Cc: | pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: using Tsearch2 for chemical text |
Date: | 2007-07-26 05:53:45 |
Message-ID: | Pine.LNX.4.64.0707260950280.18739@sn.sai.msu.ru |
Lists: | pgsql-general |
On Wed, 25 Jul 2007, Rajarshi Guha wrote:
> Hi, I have a table with about 9M entries. The table has 2 fields: id and name
> which are of serial and text types respectively. I have an ordinary index on
> the text field which allows me to do searches in reasonable time. Most of my
> searches are of the form
>
> select * from mytable where name ~ 'some text query'
>
> I know that the Tsearch2 module will let me have very efficient text
> searches. But if I understand correctly, it's based on a language specific
> dictionary.
Wrong! It comes with dictionaries for some written human languages, but you
can write your very own dictionaries. A dictionary is just a C program.
>
> My problem is that the name column contains names of chemicals. Now for many
> cases this may simply be a number (1674-56-2) and in other cases it may be an
> alphanumeric string (such as (-)O-acetylcarnitine or
> 1,2-cis-dihydroxybenzoate). In some cases it is a well-known word (say viagra
> or calcium chloride or pentathol).
>
> My question is: will Tsearch2 be able to handle this type of text? Or will it
> be hampered by the fact that the bulk of the rows do not correspond to
> ordinary English?
Oh, sure. See, for example, our dict_regex dictionary, which we use for
astronomical search.
http://lynx.sao.ru/~karpov/software/postgres_dict_regex.html
This is a work in progress, but it works.
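To illustrate the idea of a pattern-based dictionary, here is a minimal Python sketch. It is not the dict_regex interface and not the Tsearch2 C API: the names CAS_PATTERN and lexize are hypothetical, and the pattern only assumes registry-style numbers shaped like 1674-56-2. It mirrors the convention that a dictionary returns lexemes for a token it recognizes and nothing for a token it does not, so that the next dictionary can try.

```python
import re

# Hypothetical pattern: registry-style numbers such as 1674-56-2
# (2 to 7 digits, then 2 digits, then 1 digit, separated by hyphens).
CAS_PATTERN = re.compile(r"\d{2,7}-\d{2}-\d")


def lexize(token):
    """Return a list of lexemes if the token is recognized, else None.

    Returning None means "unknown token", leaving it to whatever
    dictionary is consulted next.
    """
    normalized = token.strip().lower()
    if CAS_PATTERN.fullmatch(normalized):
        # Keep the whole number as a single lexeme instead of splitting it.
        return [normalized]
    return None


if __name__ == "__main__":
    for tok in ["1674-56-2", "viagra", "1,2-cis-dihydroxybenzoate"]:
        print(tok, "->", lexize(tok))
```

A real dictionary would be written in C and registered with the database; the sketch only shows the matching logic such a dictionary might apply to chemical identifiers.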
>
> -------------------------------------------------------------------
> Rajarshi Guha <rguha(at)indiana(dot)edu>
> GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
> -------------------------------------------------------------------
> "My Ethicator machine must have had a built-in moral
> compromise spectral phantasmatron! I'm a genius."
> -Calvin
>
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83