testing/predicting optimization using indexes

From: TJ O'Donnell <tjo(at)acm(dot)org>
To: pgsql-sql(at)postgresql(dot)org
Subject: testing/predicting optimization using indexes
Date: 2005-01-22 00:51:19
Message-ID: 41F1A387.5040402@acm.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

I have several questions reagaring the kind of increase in speed I can
expect when I use a multi-column index. Here's what I've done so far.

I've written some search functions which operate on character varying
data used to represent molecular structures. We call this a Smiles string.
I want to optimize the search using an index. As a test, I've created
9 integer columns in the tables containting atom counts, e.g.
number of carbon atoms, oxygen, aromatic carbon, etc.
I then made a multi-column index. Here are some samples times

1. When the table contains only smiles, no 9 integer columns and no index:
Select count(smiles) from structure where oe_matches(smiles,'c1ccccc1CC(=O)NC');
1313 rows in about 15 seconds.

2. When the table contains smiles and the 9 integer columns as an index:
Select count(smiles) from structure where oe_matches(smiles,'c1ccccc1CC(=O)NC');
1313 rows in about 20 seconds.

3. When the table contains smiles and the 9 integer columns as an index:
Select smiles,id from structure where
(nc,nn,no,ns,"n-arom-c","n-arom-n","n-arom-o","n-arom-s",nhalo) >=
(3,1,1,0,6,0,0,0,0)
and
oe_matches(smiles,'c1ccccc1CC(=O)NC');
1313 rows in about 7 seconds.

I'm quite happy with the speedup in 3, but puzzled over the slowdown in 2.
Here are my questions.
1. Why does the search slow down after I've created the extra columns and
index, even when I don't ask to use the index in the SQL, as in 2.
2. Since I get such a nice speedup in 3, should I go to the trouble to
create a new datatype (smiles) and define how it should be indexed in a way
analogous to the 9 integer columns? In other words, could I expect an even
greater speedup using a new datatype and index?

Thanks,
TJ

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message adam etienne 2005-01-22 12:51:20 update from multiple rows
Previous Message Michael Fuhr 2005-01-21 17:04:38 Re: OID's