Re: Wildly erratic query performance

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "Eric Schwarzenbach" <subscriber(at)blackbrook(dot)org>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Wildly erratic query performance
Date: 2008-10-31 20:09:23
Message-ID: D425483C2C5C9F49B5B7A41F89441547010011F6@postal.corporate.connx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> -----Original Message-----
> From: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-
> owner(at)postgresql(dot)org] On Behalf Of Eric Schwarzenbach
> Sent: Friday, October 31, 2008 12:35 PM
> To: pgsql-general(at)postgresql(dot)org
> Subject: [GENERAL] Wildly erratic query performance
>
> I've got a particular query that is giving me ridiculously erratic
> query
> performance. I have the SQL in a pgadmin query window, and from one
> execution to another, with no changes, the time it takes varies from
> half a second to, well, at least 10 minutes or so at which point I
give
> up an cancel the query. A typical time is 2-3 seconds, but it's all
> over
> the map. I've seen numbers like 112 seconds for one which returns
> without exceeding my patience. In every half a dozen or so execution
> there will be one time which is an order of magnitude bigger than the
> others. A typical series of executions might be something like 2
> seconds, 5 seconds, 8 seconds, 0.5 seconds, 80 seconds, 5 seconds.
>
> Note that the database is running on my local machine, the same
machine
> I'm running the queries from, and nothing else is using this
postgresql
> installation. The data in the database is also not changing--there are
> no inserts or updates happening between queries. I ran a vaccuum
(full,
> analyze) just before I trying these queries. I do monitor my CPU usage
> and there is definitely not some other process on my machine sucking
up
> all the cpu cycles now and then to explain this.
>
> This is postgreslq 8.3, on Windows XP. The query joins about 17 tables
> (without an explicit JOIN, just using the WHERE criteria) with a few
> further predicates. One thing which distinguishes it from other
similar
> queries I've been doing where I haven't seen this odd erraticness is
> there are 2 predicates ORred together (and then ANDed with all the
> other
> conditions which are all ANDed) which effectively divides 2 subsets of
> joined tables which are not joined to each other, but both joined to
> another set of tables. (I don't know if that was a comprehensible way
> of
> explaining this...but I don't know if it's relevant enough to be worth
> explaining in more detail).
>
> I've tried running explain, however the wild erraticness seems to go
> away when I use explain, taking in the ballpark of 1.5 seconds every
> time. This is faster than my average query time using a plain execute,
> even if I don't discount all the unusually long times.
>
> Is there any reasonable explanation for this phenomena?
>
> I do realize I could help the query planner with explicit JOINs,
> however
> I have not yet embarked on this optimization, and might not bother if
> the query performance is acceptable without doing so. I don't expect
> the
> execution plan to be optimal, however I do expect it to be
> deterministic.

Something is missing from your descriptions.
An explain analyze on the query and a list of the schema for the
relevant tables would be helpful.
Are the queries identical? Just changing the where clause a bit can
cause big differences in query speed.

Consider:
SELECT age, status, phone FROM work_force WHERE state IN ('NY', 'CA',
'TX');
May run more slowly than:
SELECT age, status, phone FROM work_force WHERE state IN ('ID', 'MT',
'NV');
Because the first three states have large populations and the last three
states have smaller populations.

Does the database machine run solely as a database server or are there
other things going on? E.g. If you are doing a compile and link of
10,000 source files during one query and the machine is otherwise idle
during a different one, we will expect different results.

There will be (of course) a logical explanation for the query time
differences.

I suggest the following:
1. Do an explain analyze on a query that is slow
2. Do an explain analyze on a query that is fast (unless the queries
are literally identical in every way)
I guess that (along with the schema) will be enough to get an idea what
is happening.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message aravind chandu 2008-10-31 20:18:46 Need Help for a query
Previous Message Eric Schwarzenbach 2008-10-31 19:34:48 Wildly erratic query performance