From: | Michael Blakeley <mike(at)blakeley(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | group by week (ww), plus -S performance |
Date: | 2000-05-27 21:27:17 |
Message-ID: | p04320410b555b1137aae@blakeley.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
I thought I'd pass along a work-around I came up with for the limits
in 'ww' support (7.0 final). Maybe this would be a useful example for
the docs? They're a little lean on date/time examples, IMO. So is the
new book.
Task:
Select a count of records from a table, grouped by the week of the
record. The table is something like
CREATE table EVENTS
(event varchar(128) not null, stamp datetime default now());
And I want the output to use human-readable dates, like
week | count
------------+-------
2000-03-06 | 4
2000-03-13 | 5
2000-03-20 | 3
My immediate solution was something like
SELECT to_date(date_part('year',stamp)||'-'||date_part('week',stamp),
'yyyy-ww'),count(*) FROM EVENTS GROUP BY to_date;
but to_date() doesn't seem to support 'ww' format for text-to-date
translation (not documented, AFAIK).
The solution I eventually found was
SELECT to_date(date_part('year',stamp),'yyyy')-5+7*date_part('week',stamp)
as week,count(*) FROM EVENTS GROUP BY week ORDER BY week;
I haven't tested the '-5' kludge very extensively. It seems to work
ok, for the dates I tested in 2000. I'm sure it would run into
trouble with calendar-change weirdness pre-1900.
I'd also love to hear any suggestions for performance improvements -
it's cpu-bound on my system, and takes about 70 sec to run with
86,000 rows.
The query plan is
Aggregate (cost=9155.76..9584.66 rows=8578 width=20)
-> Group (cost=9155.76..9370.21 rows=85780 width=20)
-> Sort (cost=9155.76..9155.76 rows=85780 width=20)
-> Seq Scan on events (cost=0.00..2126.80 rows=85780 width=20)
The order-by clause doesn't seem to add much overhead - the query
plan is the same with or without it.
I'm running with "-i -N 64 -B 1024 -o '-F'", and I've tried up to '-S
8192' without seeing any noticeable improvement. At higher values,
performance actually went down by almost 50% - something to do with
shmem segment sizes? This is on Solaris 2.6, and I compiled PG7.0
using gcc 2.95.
The -S does reduce disk I/O, but I think that's only about 5% of the
work that's going on (90% user time). An index on 'stamp' doesn't
seem to help, either.
thanks,
-- Mike
From | Date | Subject | |
---|---|---|---|
Next Message | Barry Lind | 2000-05-28 09:37:36 | Arguments not being passed to a function |
Previous Message | Lamar Owen | 2000-05-27 19:50:05 | Re: [GENERAL] SPI & file locations |