From: | "Worky Workerson" <worky(dot)workerson(at)gmail(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Large IN query optimization |
Date: | 2006-12-13 18:22:49 |
Message-ID: | ce4072df0612131022q6831e698u154d1133788287e1@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
I'm wondering (given what I have read about PG and IN), if there is a
better way to execute this query. I have (essentially) two tables in
a read-only DB, ip_info and events, where many events can map to one
ip_info. I'd like to get all the ip_info.info columns where a
specific event occurred. This is about what I have come up with (with
indexes on all the columns):
CREATE TABLE ip_info ( ip IP4, info VARCHAR );
CREATE TABLE events ( ip IP4, event_name VARCHAR, event_type VARCHAR);
SELECT ip, info
FROM ip_info
WHERE ip IN (SELECT ip FROM events WHERE event_name = 'somename');
This works fine when there are few events named 'somename', however
this subquery can return a fairly large set of rows (>1 million) and
this query takes quite long to execute. Is there a better way to
write this query? What would be the optimal plan for this query,
given a roughly even distribution of event_name? My current plan
looks something like (as I have partitioned the events table by ip):
Hash Join
Hash Cond ("outer".ip = "inner".ip)
-> Seq Scan on ip_info
-> Hash
-> HashAggregate
-> Append
-> Index Scan using "events_ip_01_event_name_idx" on events_ip_01 events
Index Cond ...
-> Index Scan using "events_ip_02_event_name_idx" on events_ip_02 events
Index Cond ...
Is this the optimal plan for this query? BTW, ip_info has about 5M
rows, and the collective events tables have about 50M rows.
Also, slightly off-topic ... are there any optimizations that I can
use to tell PG that this is a read-only database?
PG 8.1.3 on RHEL4.3 x86_64 ... thinking about upgrading to 8.2 when I
get the time.
Thanks!
From | Date | Subject | |
---|---|---|---|
Next Message | Russ Brown | 2006-12-13 18:27:42 | Re: MySQL drops support for most distributions |
Previous Message | John D. Burger | 2006-12-13 18:20:00 | Re: MySQL drops support for most distributions |