From: | Tamas MEZEI <tamas(at)bazmag(dot)hu> |
---|---|
To: | pgsql-novice(at)postgresql(dot)org |
Subject: | summarizing traffic logs |
Date: | 2004-10-11 00:06:51 |
Message-ID: | 4169CE9B.9010403@bazmag.hu |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-novice |
Hi,
I've just started to use PgSQL, and having some trouble with aggregation
in SQL.
I have a huge-and-growing table described as:
current_traffic (
from_timestamp timestamp without timezone,
to_timestamp timestamp without timezone,
source_host inet,
dest_host inet,
flow_size bigint,
flow_kind smallint,
flow_proto smallint
);
The input is coming from a named pipe, and inserted into the table by a
perl script. I hope the field names are speaking from themselves, but
some additional info:
- source_host and dest_host: we have a campus network in the subnet
x.y.z.0/21 and an university network in the range of x.y.0.0/16, and
this will be inportant on summarizing
- flow_kind is a smallint value with discrete numbers (ie. it can be
one of {1,2,3,4}) and contains the meaning "this flow is web traffic",
"this is mail" etc.
- flow proto can be 6 or 17 (tcp and udp).
Input is coming like crazy (~60000 rows in 15 mins, we have ~2000 hosts)
and I'd like to do some aggregation in every 15 mins to a table like
below, and then truncate the current_traffic log table. (Is it ok, and
is truncating "atomically"?)
quarterly_sum (
from_timestamp timestamp without timezone,
to_timestamp timestamp without timezone,
host inet,
tcp_to_uni_kind1 bigint,
tcp_to_uni_kind2 bigint,
tcp_to_uni_kind3 bigint,
tcp_to_uni_kind4 bigint,
tcp_from_uni_kind1 bigint,
tcp_from_uni_kind2 bigint,
tcp_from_uni_kind3 bigint,
tcp_from_uni_kind4 bigint,
tcp_to_world_kind1 bigint,
tcp_to_world_kind2 bigint,
tcp_to_world_kind3 bigint,
tcp_to_world_kind4 bigint,
tcp_from_world_kind1 bigint,
tcp_from_world_kind2 bigint,
tcp_from_world_kind3 bigint,
tcp_from_world_kind4 bigint,
udp_to_uni_kind1 bigint,
udp_to_uni_kind2 bigint,
udp_to_uni_kind3 bigint,
udp_to_uni_kind4 bigint,
udp_from_uni_kind1 bigint,
udp_from_uni_kind2 bigint,
udp_from_uni_kind3 bigint,
udp_from_uni_kind4 bigint,
udp_to_world_kind1 bigint,
udp_to_world_kind2 bigint,
udp_to_world_kind3 bigint,
udp_to_world_kind4 bigint,
udp_from_world_kind1 bigint,
udp_from_world_kind2 bigint,
udp_from_world_kind3 bigint,
udp_from_world_kind4 bigint
);
Classifying "world" and "university" traffic is quite easy with PgSQL
inet functions, but how should i create the aggregations grouped by the
kind of the flow?
I had some thoughts about creating some views, or using
triggers/cursors, but I'm not that deep in PgSQL to fully understand
every bit.
If anybody would help me solving this problem, that would be highly
appreciated.
Thanks,
Tamas
From | Date | Subject | |
---|---|---|---|
Next Message | Number One | 2004-10-11 08:10:02 | Q: parameters to functions |
Previous Message | V i s h a l Kashyap @ [Sai Hertz And Control Systems] | 2004-10-09 16:15:31 | View or Function as default field |