From: | George Nychis <gnychis(at)cmu(dot)edu> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | help optimizing query |
Date: | 2008-03-22 18:48:07 |
Message-ID: | 47E55467.6070800@cmu.edu |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi all,
I'm looking for helping optimizing a query. It currently requires two
passes on the data per query, when I'd like to try and write it to only
require a single pass.
Here's the high level, it's parsing flow level network traffic and it's
interested in 4 fields:
src_ip, dst_ip, src_packets, dst_packets
In words:
'src_ip' sent 'src_packets' number of packets to 'dst_ip'
'dst_ip' sent 'dst_packets' number of packets to 'src_ip'
For every IP address, I want to count how many packets were sent to it.
This could come one of two ways, the IP is the source in the flow, and
it received dst_packets. Or, the IP is the destination in the flow, and
it received src_packets.
My current method is to split that into two queries and then take the
union. But, I was wondering if its possible to do this in one pass
through the data?
SELECT ip,sum(dst_packets)
FROM(
(SELECT dst_ip AS ip,sum(src_packets) AS dst_packets
FROM flows
WHERE interval='2005-02-01 00:00:00'
GROUP BY dst_ip)
UNION ALL
(SELECT src_ip AS ip,sum(dst_packets) AS dst_packets
FROM flows
WHERE interval='2005-02-01 00:00:00'
GROUP BY src_ip) )
AS stats
GROUP BY stats.ip
HAVING sum(dst_packets)>0
ORDER BY sum(dst_packets) DESC
- George
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Meskes | 2008-03-22 20:11:12 | Re: --enable-thread-safety bug |
Previous Message | Ivan Sergio Borgonovo | 2008-03-22 18:35:52 | Re: dynamically generated SQL and planner/performance |