From: | Dracula 007 <dracula007(at)atlas(dot)cz> |
---|---|
To: | pgsql-sql(at)postgresql(dot)org |
Subject: | how to speed up these queries ? |
Date: | 2005-03-03 11:05:46 |
Message-ID: | 4226EF8A.3020403@atlas.cz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
Hello,
I have two "large" tables - "sessions" (about 1.500.000 rows) and
"actions" (about 4.000.000 rows), and the "actions" table is connected
to the "sessions" (it contains a key from it). The simplified structure
of these tables is
sessions (
session_id int4,
visitor_id int4,
session_ip inet,
session_date timestamp
)
actions (
action_id int4,
session_id int4, -- foreign key, references sessions(session_id)
action_date timestamp,
action_year int2,
action_month int2,
action_day int2
)
I run SQL queries like
SELECT
COUNT(actions.session_id) AS sessions_count,
COUNT(DISTINCT visitor_id) AS visitors_count,
COUNT(DISTINCT session_ip) AS ips_count
FROM actions LEFT JOIN sessions USING (session_id)
GROUP BY action_year, action_month, action_day
but it's really really slow. I've tried to use different indexes on
different columns, but no matter what I've tried I can't get it faster.
The explain analyze of the query is
--------------------------------------------------
Aggregate (cost=347276.05..347276.05 rows=1 width=23) (actual
time=210060.349..210060.350 rows=1 loops=1)
-> Hash Left Join (cost=59337.55..305075.27 rows=4220077 width=23)
(actualtime=24202.338..119924.254 rows=4220077 loops=1)
Hash Cond: ("outer".session_id = "inner".session_id)
-> Seq Scan on actions (cost=0.00..114714.77 rows=4220077
width=8) (actual time=7539.653..44585.023 rows=4220077 loops=1)
-> Hash (cost=47650.64..47650.64 rows=1484764 width=19)
(actual time=16628.790..16628.790 rows=0 loops=1)
-> Seq Scan on sessions (cost=0.00..47650.64
rows=1484764 width=19) (actual time=0.041..13378.667 rows=1484764 loops=1)
Total runtime: 210061.073 ms
--------------------------------------------------
As you can see it runs for about 4 mins, which is not too fast. Is there
some way to speed up such queries?
From | Date | Subject | |
---|---|---|---|
Next Message | Markus Schaber | 2005-03-03 12:17:25 | Re: Multiples schemas |
Previous Message | Bret Hughes | 2005-03-03 07:27:25 | Re: definative way to place secs from epoc into timestamp |