From: | Mike Charnoky <noky(at)nextbus(dot)com> |
---|---|
To: | Gregory Stark <stark(at)enterprisedb(dot)com> |
Cc: | Alban Hertroys <a(dot)hertroys(at)magproductions(dot)nl>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, "A(dot) Kretschmer" <andreas(dot)kretschmer(at)schollglas(dot)com>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: more problems with count(*) on large table |
Date: | 2007-10-01 17:00:21 |
Message-ID: | 470127A5.1060505@nextbus.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
This is strange... count(*) operations over a period of one day's worth
of data now take ~1-2 minutes to run or ~40 minutes. It seems that the
first time the data is queried it takes about 40 minutes. If I try the
query again, it finishes in 1-2 minutes!
Again, nothing else is happening on this db server except for a constant
insertion into this table and a few others. I have done "set statistics
100" for the evtime field in this table.
Here is the output from EXPLAIN ANALYZE. This is the same query run
back to back, first time takes 42 minutes, second time takes less than 2
minutes!
mydb=# explain analyze select count(*) from prediction_accuracy where
evtime between '2007-09-29' and '2007-09-30';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=3.02..3.03 rows=1 width=0) (actual
time=2549854.351..2549854.352 rows=1 loops=1)
-> Index Scan using pred_acc_evtime_index on prediction_accuracy
(cost=0.00..3.02 rows=1 width=0) (actual time=97.676..2532824.892
rows=11423786 loops=1)
Index Cond: ((evtime >= '2007-09-29 00:00:00-07'::timestamp
with time zone) AND (evtime <= '2007-09-30 00:00:00-07'::timestamp with
time zone))
Total runtime: 2549854.411 ms
(4 rows)
Time: 2549943.506 ms
mydb=# explain analyze select count(*) from prediction_accuracy where
evtime between '2007-09-29' and '2007-09-30';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=3.02..3.03 rows=1 width=0) (actual
time=111200.943..111200.944 rows=1 loops=1)
-> Index Scan using pred_acc_evtime_index on prediction_accuracy
(cost=0.00..3.02 rows=1 width=0) (actual time=36.396..96347.483
rows=11423786 loops=1)
Index Cond: ((evtime >= '2007-09-29 00:00:00-07'::timestamp
with time zone) AND (evtime <= '2007-09-30 00:00:00-07'::timestamp with
time zone))
Total runtime: 111201.000 ms
(4 rows)
Time: 111298.695 ms
Mike
Gregory Stark wrote:
> "Mike Charnoky" <noky(at)nextbus(dot)com> writes:
>
>> I altered the table in question, with "set statistics 100" on the
>> timestamp column, then ran analyze. This seemed to help somewhat. Now,
>> queries don't seem to hang, but it still takes a long time to do the count:
>> * "where evtime between '2007-09-26' and '2007-09-27'"
>> took 37 minutes to run (result was ~12 million)
>> * "where evtime between '2007-09-25' and '2007-09-26'"
>> took 40 minutes to run (result was ~14 million)
>>
>> Still stymied about the seemingly random performance, especially since I
>> have seen this query execute in 2 minutes.
>
>
> And the "explain analyze" for these?
>
> Are you still sure it's certain date ranges which are consistently problems
> and others are consistently fast? Or could it be something unrelated.
>
From | Date | Subject | |
---|---|---|---|
Next Message | Alan Hodgson | 2007-10-01 17:12:05 | Re: more problems with count(*) on large table |
Previous Message | joynes | 2007-10-01 16:56:28 | Find out encoding of data |