Re: Database Select Slow

From: "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
To: "carter ck" <carterck32(at)hotmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Database Select Slow
Date: 2007-08-10 15:57:19
Message-ID: dcc563d10708100857o20aeac41k411985f8963d3026@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 8/10/07, carter ck <carterck32(at)hotmail(dot)com> wrote:
> Hi all,
>
> I am facing a performance issue here. Whenever I do a count(*) on a table
> that contains about 300K records, it takes few minutes to complete. Whereas
> my other application which is counting > 500K records just take less than 10
> seconds to complete.
>
> I have indexed all the essential columns and still it does not improve the
> speed.

As previously mentioned, indexes won't help with a count(*) with no
where clause.

They might help with a where clause, if it's quite selective, but if
you're grabbing a noticeable percentage of a table, pgsql will rightly
switch to a seq scan.

Here's some examples from my goodly sized stats db here at work:

\timing
explain select * from businessrequestsummary;
QUERY PLAN
-------------------------------------------------------------------------------------
Seq Scan on businessrequestsummary (cost=0.00..3280188.63
rows=67165363 width=262)
Time: 0.441 ms

gives me an approximate value of 67,165,363 rows.

explain select * from businessrequestsummary where lastflushtime >
now() - interval '1 day';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Index Scan using businessrequestsummary_lastflushtime_dx on
businessrequestsummary (cost=0.00..466.65 rows=6661 width=262)
Index Cond: (lastflushtime > (now() - '1 day'::interval))
says 6661 rows. and takes 0.9 ms and would use the index.

To run the real queries I get much slower times. :)

Now, to run the real count(*) queries:

select count(*) from businessrequestsummary where lastflushtime >
now() - interval '1 day';
count
--------
274192
(1 row)

Time: 546.528 ms

(data in the buffers makes it fast)

select count(*) from businessrequestsummary where lastflushtime >
now() - interval '1 week';
count
---------
1700050
(1 row)

Time: 26291.155 ms

second run (data now in buffer)

select count(*) from businessrequestsummary where lastflushtime >
now() - interval '1 week';
count
---------
1699689
(1 row)

Time: 2592.573 ms

Note the number changed, because this db is constantly being updated
in real time with production statistics.

I'm not going to run a select count(*) on that db, because it would
take about 30 minutes to run. It's got about 67million rows in it.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adam Endicott 2007-08-10 16:15:12 Re: Confusing performance of specific query
Previous Message Greg Smith 2007-08-10 15:48:28 Re: UPDATES hang every 5 minutes