From: David Ford <david+cert(at)blue-labs(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Optimizing suggestions
Date: 2002-06-23 16:57:59
Message-ID: 3D15FE17.7040103@blue-labs.org
Lists: pgsql-general
Tom Lane wrote:
>David Ford <david+cert(at)blue-labs(dot)org> writes:
>
>>As stated above, I'd like to fetch a count of rows
>>per day for the last N days.
>
>GROUP BY seems like the obvious answer...
So it does,
bmilter=> explain analyze select count(*) from history group by timestamp::date;
NOTICE:  QUERY PLAN:

Aggregate  (cost=1784.68..1878.46 rows=1876 width=8) (actual time=515.60..661.22 rows=8 loops=1)
  ->  Group  (cost=1784.68..1831.57 rows=18755 width=8) (actual time=510.75..623.29 rows=19310 loops=1)
        ->  Sort  (cost=1784.68..1784.68 rows=18755 width=8) (actual time=510.73..535.28 rows=19310 loops=1)
              ->  Seq Scan on history  (cost=0.00..453.55 rows=18755 width=8) (actual time=0.05..354.04 rows=19310 loops=1)
Total runtime: 667.70 msec
EXPLAIN
and..
bmilter=> explain analyze select count(*) from history where timestamp::date > (now()::date - '7 days'::interval) group by timestamp::date;
NOTICE:  QUERY PLAN:

Aggregate  (cost=1175.93..1207.19 rows=625 width=8) (actual time=1233.34..1349.87 rows=7 loops=1)
  ->  Group  (cost=1175.93..1191.56 rows=6252 width=8) (actual time=1204.19..1313.24 rows=18693 loops=1)
        ->  Sort  (cost=1175.93..1175.93 rows=6252 width=8) (actual time=1204.17..1227.90 rows=18693 loops=1)
              ->  Seq Scan on history  (cost=0.00..781.76 rows=6252 width=8) (actual time=8.52..1049.05 rows=18693 loops=1)
Total runtime: 1356.16 msec
EXPLAIN
bmilter=> select count(*) from history where timestamp::date > (now()::date - '7 days'::interval) group by timestamp::date;
count
-------
3743
3414
3513
3077
2642
1553
751
(7 rows)
Now.. :) How can I further improve on this? Would this be faster if I split the timestamp into two separate 'date' and 'time' columns?
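Or, instead of splitting the column, maybe a functional index on the date part of the timestamp would let the planner replace the seq scan with an index scan? An untested sketch of what I have in mind (the index name is made up):

CREATE INDEX history_date_idx ON history (date(timestamp));

-- rewritten with date() so the expression matches the index
SELECT count(*)
  FROM history
 WHERE date(timestamp) > (now()::date - '7 days'::interval)
 GROUP BY date(timestamp);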
>>In the near future I'm going to split this table up into many sub tables
>>
>>
>
>Why? Aren't you just going to have a need for lots of joins, if you
>do that?
>
> regards, tom lane
>
No, actually most of this data will be drawn into focused reports for graphing metrics. Only when I drill down to a specific (small) set of message serial numbers will I be doing a join across all of this data.
In my naivety, which approach is more effective for performance? Most of the time I will be searching for a count of statistics on a single column.
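For example, a typical query would look something like this (the column name is hypothetical):

SELECT some_column, count(*)
  FROM history
 GROUP BY some_column;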
Thank you for your assistance; the knowledge is appreciated,
David
p.s. I am assuming count(*) is always faster than count(column), because it does a generic row count instead of accumulating stats on a column inside each row?