From: | Dav Coleman <dav(at)danger-island(dot)com> |
---|---|
To: | Josh Berkus <josh(at)agliodbs(dot)com> |
Cc: | pgsql-sql(at)postgresql(dot)org |
Subject: | Re: SQL Query Optimization |
Date: | 2002-04-18 16:35:27 |
Message-ID: | 20020418093527.I24932@danger-island.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
I should be more clear, the problem is that the application user can
basically construct the SQL query dynamically, so I have no control on
how the original SQL query will be formed or what it will consist of.
It can be any possible query in practice. Because of this, it is not just
a matter of analyzing any specific queries, and i don't want to start
creating every possible index (although i might, if i have to).
But I can see where I was heading in the wrong direction already. I was
thinking that what I needed was to find theories/algorithms on how to
rewrite the SQL before submitting it to postgresql, and I maybe still
need to do that, but I guess I also need to EXPLAIN and analyze the
bad vs good forms of the queries so I'll know what makes a 'good' vs
'bad' query (so I'll get a sense on how to rewrite queries). Perhaps
with that understanding, an algorithm for rewriting the queries will
be apparent.
I just figured I couldn't be the first person to run into this problem,
but I can't find it mentioned anywhere.
-Dav
btw, I'm running postgresql-7.1.2 (compilied from source) on rh7.0
Josh Berkus [josh(at)agliodbs(dot)com] wrote:
> Dav,
>
> > I am using postgresql to house chemical informatics data which
> > consists of
> > several interlinked tables with tens of thousands (maximum) of rows.
> > When
> > doing search queries against these tables (which always requires
> > multiple
> > joins) I have noticed that the semantically equivalent SQL queries
> > can differ
> > vastly in speed performance depending on the order of clauses ANDed
> > together ( "WHERE cond1 AND cond2" takes forever, but "WHERE cond2
> > AND cond1" comes right back).
>
> In most cases, the above kind of optimization difference is due to how
> you indexed the table. If, for example, you have an index on (field2,
> field1), and you do a "WHERE field1 = y and field2 = x" then the query
> parser probably won't use the index because the field order is
> different.
>
> Fortunately, in Postgres 7.2, you now get index usage statistics.
> Hopefully another user will follow-up this e-mail by explaining how to
> access them.
>
> The idea is that, if you find that certain views and queries are very
> slow, then check what tables they all have in common. Then check the
> indexes and statistics for each table. If you see a large table with
> only 3 indexes, none of which are getting much use, then they are
> pobpably the wrong indexes or you need to change the structure of your
> WHERE clause. Also, EXPLAIN can be a big help here.
>
> See http://techdocs.postgresql.org for more stuff about optimization.
>
> I can understand that you'd like a tool to make all this easier for
> you, but I haven't seen any such thing, unless it ships with the
> Enterprise version of Oracle.
>
> -Josh Berkus
--
Dav Coleman
http://www.danger-island.com/dav/
From | Date | Subject | |
---|---|---|---|
Next Message | Jan Wieck | 2002-04-18 16:43:30 | Re: call the same pl/pgsql procedure twice in the same connection |
Previous Message | Tom Lane | 2002-04-18 16:15:43 | Re: SQL Query Optimization |