From: | "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com> |
---|---|
To: | "carter ck" <carterck32(at)hotmail(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Database Select Slow |
Date: | 2007-08-10 15:57:19 |
Message-ID: | dcc563d10708100857o20aeac41k411985f8963d3026@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 8/10/07, carter ck <carterck32(at)hotmail(dot)com> wrote:
> Hi all,
>
> I am facing a performance issue here. Whenever I do a count(*) on a table
> that contains about 300K records, it takes few minutes to complete. Whereas
> my other application which is counting > 500K records just take less than 10
> seconds to complete.
>
> I have indexed all the essential columns and still it does not improve the
> speed.
As previously mentioned, indexes won't help with a count(*) with no
where clause.
They might help with a where clause, if it's quite selective, but if
you're grabbing a noticeable percentage of a table, pgsql will rightly
switch to a seq scan.
Here's some examples from my goodly sized stats db here at work:
\timing
explain select * from businessrequestsummary;
QUERY PLAN
-------------------------------------------------------------------------------------
Seq Scan on businessrequestsummary (cost=0.00..3280188.63
rows=67165363 width=262)
Time: 0.441 ms
gives me an approximate value of 67,165,363 rows.
explain select * from businessrequestsummary where lastflushtime >
now() - interval '1 day';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Index Scan using businessrequestsummary_lastflushtime_dx on
businessrequestsummary (cost=0.00..466.65 rows=6661 width=262)
Index Cond: (lastflushtime > (now() - '1 day'::interval))
says 6661 rows. and takes 0.9 ms and would use the index.
To run the real queries I get much slower times. :)
Now, to run the real count(*) queries:
select count(*) from businessrequestsummary where lastflushtime >
now() - interval '1 day';
count
--------
274192
(1 row)
Time: 546.528 ms
(data in the buffers makes it fast)
select count(*) from businessrequestsummary where lastflushtime >
now() - interval '1 week';
count
---------
1700050
(1 row)
Time: 26291.155 ms
second run (data now in buffer)
select count(*) from businessrequestsummary where lastflushtime >
now() - interval '1 week';
count
---------
1699689
(1 row)
Time: 2592.573 ms
Note the number changed, because this db is constantly being updated
in real time with production statistics.
I'm not going to run a select count(*) on that db, because it would
take about 30 minutes to run. It's got about 67million rows in it.
From | Date | Subject | |
---|---|---|---|
Next Message | Adam Endicott | 2007-08-10 16:15:12 | Re: Confusing performance of specific query |
Previous Message | Greg Smith | 2007-08-10 15:48:28 | Re: UPDATES hang every 5 minutes |