From: | "Anthony Molinaro" <amolinaro(at)wgen(dot)net> |
---|---|
To: | "Hector Rosas" <jeziel(dot)rosas(at)gmail(dot)com>, <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: Selecting records not present in related tables |
Date: | 2005-10-06 20:11:44 |
Message-ID: | 3C6C2B281FD3E74C9F7C9D5B1EDA45821825D7@wgexch01.wgenhq.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
Jeziel,
there are a couple techniques you can try, two I like are set
difference and anti-joins.
here's the set diff:
select id
from messages
except
select id
from usermessages
that will returns all id from messages not in usermessages
if ID is indexed on both tables, you may wanna try an anti join:
select m.id
from messages m
left join
usermessages um
on ( m.id = um.id )
where um.id is null
both techniques can be visciously efficient.
good luck,
Anthony
-----Original Message-----
From: pgsql-sql-owner(at)postgresql(dot)org
[mailto:pgsql-sql-owner(at)postgresql(dot)org] On Behalf Of Hector Rosas
Sent: Thursday, October 06, 2005 3:44 PM
To: pgsql-sql(at)postgresql(dot)org
Subject: [SQL] Selecting records not present in related tables
Hello, I'm trying to select records in a table not present in a related
table, in example, I've a table with message information (subject,
message, date, etc) and another (usermessages) with where user(s) has
that message, its state, etc. Records in this table will be deleted in a
certain time (just some extra info).
I want to select messages records that aren't present in the other table
(usermessages), I got the next two queries, maybe someone can suggest a
better one.
SELECT m.id FROM messages AS m
WHERE (SELECT count(um.*) FROM usermessages AS um WHERE
um.idmessage=m.id )=0;
SELECT m.id FROM messages AS m where id NOT IN (select um.idmessage FROM
usermessages um);
Both queries work, but doing a EXPLAIN ANALYZE I got the next results.
bd=# explain analyze SELECT m.id FROM messages AS m
bd-# WHERE (SELECT count(um.*) FROM usermessages AS um WHERE
um.idmessage=m.id)=0;
QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
---
Seq Scan on messages m (cost=0.00..3915.75 rows=3 width=4) (actual
time=40.531..40.531 rows=0 loops=1)
Filter: ((subplan) = 0)
SubPlan
-> Aggregate (cost=9.11..9.11 rows=1 width=4) (actual
time=0.098..0.104 rows=1 loops=355)
-> Index Scan using message_selection on usermessages um
(cost=0.00..9.10 rows=3 width=4) (actual time=0.067..0.078 rows=1
loops=355)
Index Cond: (idmessage = $0)
Total runtime: 40.605 ms
(7 rows)
bd=# explain analyze select m.id FROM messages AS m where id NOT IN
(select um.idmessage FROM usermessages um);
QUERY PLAN
------------------------------------------------------------------------
----------------------------------------------
Seq Scan on messages m (cost=9.68..43.00 rows=213 width=4) (actual
time=20.329..20.329 rows=0 loops=1)
Filter: (NOT (hashed subplan))
SubPlan
-> Seq Scan on usermessages um (cost=0.00..8.54 rows=454 width=4)
(actual time=0.008..13.094 rows=454 loops=1)
Total runtime: 20.386 ms
(5 rows)
In first query, cost can be between 0 and almost 4 sec, and also I see
that loops value, that I don't know what performance issues could arise.
In second query, I see a seq scan, which I don't like, I think that with
too many records this query could take ages, or maybe not, but loops
value is 1.
I hope someone can give some advice with those queries , or maybe a
better query. I've not decided which query I'm going to use, thanks!
Jeziel.
From | Date | Subject | |
---|---|---|---|
Next Message | Scott Marlowe | 2005-10-06 20:13:41 | Re: Selecting records not present in related tables |
Previous Message | Hector Rosas | 2005-10-06 19:43:33 | Selecting records not present in related tables |