Re: Optimizing suggestions

From: David Ford <david+cert(at)blue-labs(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Optimizing suggestions
Date: 2002-06-23 16:57:59
Message-ID: 3D15FE17.7040103@blue-labs.org
Lists: pgsql-general

Tom Lane wrote:

>David Ford <david+cert(at)blue-labs(dot)org> writes:
>
>
>>As stated above, I'd like to fetch a count of rows
>>per day for the last N days.
>>
>>
>
>GROUP BY seems like the obvious answer...
>

So it does,

bmilter=> explain analyze select count(*) from history group by timestamp::date;
NOTICE:  QUERY PLAN:

Aggregate  (cost=1784.68..1878.46 rows=1876 width=8) (actual time=515.60..661.22 rows=8 loops=1)
  ->  Group  (cost=1784.68..1831.57 rows=18755 width=8) (actual time=510.75..623.29 rows=19310 loops=1)
        ->  Sort  (cost=1784.68..1784.68 rows=18755 width=8) (actual time=510.73..535.28 rows=19310 loops=1)
              ->  Seq Scan on history  (cost=0.00..453.55 rows=18755 width=8) (actual time=0.05..354.04 rows=19310 loops=1)
Total runtime: 667.70 msec

EXPLAIN

and..

bmilter=> explain analyze select count(*) from history
bmilter->   where timestamp::date > (now()::date - '7 days'::interval)
bmilter->   group by timestamp::date;
NOTICE:  QUERY PLAN:

Aggregate  (cost=1175.93..1207.19 rows=625 width=8) (actual time=1233.34..1349.87 rows=7 loops=1)
  ->  Group  (cost=1175.93..1191.56 rows=6252 width=8) (actual time=1204.19..1313.24 rows=18693 loops=1)
        ->  Sort  (cost=1175.93..1175.93 rows=6252 width=8) (actual time=1204.17..1227.90 rows=18693 loops=1)
              ->  Seq Scan on history  (cost=0.00..781.76 rows=6252 width=8) (actual time=8.52..1049.05 rows=18693 loops=1)
Total runtime: 1356.16 msec

EXPLAIN

bmilter=> select count(*) from history
bmilter->   where timestamp::date > (now()::date - '7 days'::interval)
bmilter->   group by timestamp::date;
count
-------
3743
3414
3513
3077
2642
1553
751
(7 rows)

Now.. :)  How can I further improve on this?  Would this be faster if I
split the column into two separate 'date' and 'time' columns?
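One thing I'm considering (just a sketch, not yet tried against this
schema): a functional index on the date part of the timestamp, so the
last-N-days filter can use an index scan instead of the seq scan shown
above.  The index name is made up, and this assumes the column is a
plain timestamp without time zone (date() on timestamptz isn't
immutable and can't be indexed):

```sql
-- Hypothetical functional index on the date part of "timestamp".
CREATE INDEX history_date_idx ON history (date("timestamp"));

-- The query must use the same indexed expression verbatim:
SELECT date("timestamp") AS day, count(*)
  FROM history
 WHERE date("timestamp") > current_date - 7
 GROUP BY day
 ORDER BY day;
```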

>>In the near future I'm going to split this table up into many sub tables
>>
>>
>
>Why? Aren't you just going to have a need for lots of joins, if you
>do that?
>
> regards, tom lane
>

No, actually most of this data will be drawn into focused reports for
graphing metrics.  Only when I drill down to a specific (small) set of
message serial numbers will I be doing a join of all this data.

In my naivety, which is more effective for performance?  Most of the
time I will be computing counts and statistics over a single column.
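For the record, a rough sketch of the other alternative I'm weighing
instead of splitting the table: rolling daily counts into a small
summary table (table and column names here are illustrative only):

```sql
-- Hypothetical daily-summary table, refreshed periodically,
-- so the graphing queries never touch the big history table.
CREATE TABLE history_daily (
    day  date PRIMARY KEY,
    hits integer NOT NULL
);

INSERT INTO history_daily (day, hits)
SELECT date("timestamp"), count(*)
  FROM history
 GROUP BY 1;
```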

Thank you for your assistance, knowledge is appreciated,
David

p.s. I am assuming count(*) is always at least as fast as
count(column), because it does a generic row count rather than having
to examine a column inside each row?
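A quick sanity check of the semantic difference between the two (SQLite
here purely for a self-contained illustration; as far as I know the
NULL-handling is the same in PostgreSQL): count(*) counts rows, while
count(column) counts only rows where that column is not NULL, which is
why it has to look inside each row.

```python
import sqlite3

# In-memory table with one NULL value in "note".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (msg_id INTEGER, note TEXT)")
conn.executemany(
    "INSERT INTO history VALUES (?, ?)",
    [(1, "ok"), (2, None), (3, "ok")],
)

# count(*) counts every row; count(note) skips the NULL.
count_star = conn.execute("SELECT count(*) FROM history").fetchone()[0]
count_col = conn.execute("SELECT count(note) FROM history").fetchone()[0]
print(count_star, count_col)  # 3 2
```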
