Optimizing suggestions

From: David Ford <david+cert(at)blue-labs(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Optimizing suggestions
Date: 2002-06-22 01:02:13
Message-ID: 3D13CC95.7010109@blue-labs.org
Lists: pgsql-general

Ok, I have a table which is currently a bit of a conglomerate. I intend
to break it down into multiple tables, as better befits an RDBMS, but
here is the scoop so far. I have a table on which I need to count up
statistics based on the timestamp of each entry.

bmilter=> \d history ;
Table "history"
Column | Type | Modifiers
-----------------+--------------------------+---------------
bmilter_host_id | inet | not null
timestamp | timestamp with time zone | default now()
size | bigint | not null
content_type | smallint | default '0'
message_id | character varying | not null
Indexes: history_idx

The index is a btree on the timestamp column.

What I need to do is get a count of the number of rows for a given day
for the last N days -- typically 7, 14, a month, etc. At present each
query takes just under one second, ~680ms, and returns the count for
one day. So for a week of data, this comes out to 7 queries and about
4 seconds of time. An example query for two days ago is:

bmilter=> select count(*) from history where timestamp::date =
now()::date - '2 days'::interval;
count
-------
3513

I am currently accumulating about 4K rows per day but I expect this to
nearly double. I'm looking for enlightenment, suggestions on how to
improve this, etc. As stated above, I'd like to fetch a count of rows
per day for the last N days.
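
One thing I've wondered is whether I could collapse the N separate
queries into a single grouped one. Something like this sketch (untested)
should return one row per day for the last week:

    select timestamp::date as day, count(*)
      from history
     where timestamp >= now()::date - '7 days'::interval
     group by day
     order by day;

That would at least trade N scans of the table for one.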

The explain return is here:

bmilter=> explain analyze select count(*) from history where
timestamp::date = now()::date - '2 days'::interval;
NOTICE: QUERY PLAN:

Aggregate (cost=690.66..690.66 rows=1 width=0) (actual
time=674.95..674.95 rows=1 loops=1)
-> Seq Scan on history (cost=0.00..690.46 rows=83 width=0) (actual
time=219.03..663.33 rows=3513 loops=1)
Total runtime: 675.11 msec
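
I suspect the ::date cast on the column is what forces the sequential
scan, since the btree index is on the raw timestamp. Would rewriting the
filter as a plain range comparison let the planner use the index?
Something along these lines:

    select count(*)
      from history
     where timestamp >= now()::date - '2 days'::interval
       and timestamp <  now()::date - '1 day'::interval;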

In the near future I'm going to split this table up into many sub tables
and introduce several more columns, i.e.:

create table hx_id (
    serial          int8,
    bmilter_host_id inet NOT NULL,
    timestamp       timestamp DEFAULT now() primary key
);

create table hx_stats (
    serial       int8,
    size         int8 DEFAULT '0',
    content_type int2 DEFAULT '0',
    message_id   character varying
);

create table hx_recipients (
    serial     int8,
    recipients character varying
);

create table hx_spam_tagged (
    serial  int8,
    spam_id int8 references spam_ids (id) on update cascade
);

I believe I've laid it out now, so hit me with it if you care to. What
can I do to improve the speed of these queries? I don't have the
fastest of machines, so I need to tune as best I can. Normal postmaster
tuning aside (I've already increased the shared buffers, etc.), would
touching any of the optimizer settings help? I'm needing improvement on
my concepts, I'm sure.

Thank you,
David
