From: | Steve Clark <sclark(at)netwolves(dot)com> |
---|---|
To: | Daniel Verite <daniel(at)manitou-mail(dot)org> |
Cc: | pgsql <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: best way to query |
Date: | 2008-01-25 17:15:43 |
Message-ID: | 479A193F.8030007@netwolves.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Daniel Verite wrote:
> Steve Clark wrote:
>
>
>>any way i have 2 table - A and B.
>>each table has a key field and if a row is in B it should have a
>>corresponding row in A - but theres
>>the problem it doesn't for all the rows in B.
>>
>>So I want to do something like
>>delete from B where key not in (select key from A order by key);
>>
>>The problem is there are about 1,000,000 rows in A and 300,000 rows
>
> in
>
>>B. I let the above run
>>all night and it was still running the next morning. Does anyone have
>
>
>>an idea of a better way.
>
>
> An outer join is sometimes spectacularly more efficient for this
> particular kind of query.
>
> I'd suggest you try:
>
> delete from B where key in
> (select B.key from B left outer join A on A.key=B.key
> where A.key is null)
>
WOW!
this runs in about 10 seconds - thanks Daniel.
explain select count(*) from t_event_ack_log where event_log_no in
(select t_event_ack_log.event_log_no from t_event_ack_log left outer
join t_unit_event_log a on a.event_log_no=t_event_ack_log.event_log_no
where a.event_log_no is null);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Aggregate (cost=128349.56..128349.57 rows=1 width=0)
-> Hash Join (cost=94512.91..126935.36 rows=565681 width=0)
Hash Cond: (public.t_event_ack_log.event_log_no =
public.t_event_ack_log.event_log_no)
-> Seq Scan on t_event_ack_log (cost=0.00..14759.85
rows=565685 width=4)
-> Hash (cost=92609.85..92609.85 rows=152245 width=4)
-> HashAggregate (cost=91087.40..92609.85
rows=152245 width=4)
-> Hash Left Join (cost=57337.95..90380.29
rows=282842 width=4)
Hash Cond:
(public.t_event_ack_log.event_log_no = a.event_log_no)
Filter: (a.event_log_no IS NULL)
-> Seq Scan on t_event_ack_log
(cost=0.00..14759.85 rows=565685 width=4)
-> Hash (cost=40696.09..40696.09
rows=1014309 width=4)
-> Seq Scan on t_unit_event_log a
(cost=0.00..40696.09 rows=1014309 width=4)
From | Date | Subject | |
---|---|---|---|
Next Message | Steve Clark | 2008-01-25 17:19:18 | Re: best way to query |
Previous Message | Tom Lane | 2008-01-25 17:05:12 | Re: best way to query |