From: | Ken Hill <ken(at)scottshill(dot)com> |
---|---|
To: | Markus Schaber <schabi(at)logix-tt(dot)com> |
Cc: | pgsql-sql(at)postgresql(dot)org |
Subject: | Re: Non Matching Records in Two Tables |
Date: | 2006-02-14 20:55:07 |
Message-ID: | 1139950507.3083.64.camel@localhost.localdomain |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
On Wed, 2006-02-08 at 22:31 +0100, Markus Schaber wrote:
> Hi, Ken,
>
> Ken Hill schrieb:
> > I need some help with a bit of SQL. I have two tables. I want to find
> > records in one table that don't match records in another table based on
> > a common column in the two tables. Both tables have a column named
> > 'key100'. I was trying something like:
> >
> > SELECT count(*)
> > FROM table1, table2
> > WHERE (table1.key100 != table2.key100);
> >
> > But the query is very slow and I finally just cancel it. Any help is
> > very much appreciated.
>
> Do you have indices on the key100 columns? Is autovacuum running, or do
> you do analyze manually?
>
> Can you send us the output from "EXPLAIN ANALYZE [your query]"?
>
> Btw, I don't think this query will do what you wanted, it basically
> creates a cross product, that means if your tables look like:
>
> schabitest=# select * from table1;
> key100 | valuea | valueb
> --------+--------+--------
> 1 | foo | bar
> 2 | blah | blubb
> 3 | manga | mungo
>
> schabitest=# select * from table2;
> key100 | valuec | valued
> --------+--------+--------
> 1 | monday | euro
> 2 | sunday | dollar
> 4 | friday | pounds
>
> Then your query will produce something like:
> schabitest=# select * from table1, table2 WHERE (table1.key100 !=
> table2.key100);
> key100 | valuea | valueb | key100 | valuec | valued
> --------+--------+--------+--------+--------+--------
> 1 | foo | bar | 2 | sunday | dollar
> 1 | foo | bar | 4 | friday | pounds
> 2 | blah | blubb | 1 | monday | euro
> 2 | blah | blubb | 4 | friday | pounds
> 3 | manga | mungo | 1 | monday | euro
> 3 | manga | mungo | 2 | sunday | dollar
> 3 | manga | mungo | 4 | friday | pounds
>
> I suggest you would like to have all records from table1 that don't have
> a corresponding record in table2:
>
> schabitest=# select * from table1 where table1.key100 not in (select
> key100 from table2);
> key100 | valuea | valueb
> --------+--------+--------
> 3 | manga | mungo
>
> HTH,
> Markus
>
Here is my query SQL:
SELECT key100 FROM ncccr10
WHERE ncccr10.key100 NOT IN (SELECT key100 FROM ncccr9);
It is is running after 30 minutes. Here is the query plan:
QUERY PLAN
-------------------------------------------------------------------------
Seq Scan on ncccr10 (cost=0.00..20417160510.08 rows=305782 width=104)
Filter: (NOT (subplan))
SubPlan
-> Seq Scan on ncccr9 (cost=0.00..65533.71 rows=494471 width=104)
(4 rows)
Any ideas why it is so slow?
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2006-02-14 21:07:29 | Re: Non Matching Records in Two Tables |
Previous Message | Markus Schaber | 2006-02-14 16:39:03 | Re: Slow update SQL |