From: | Shaul Dar <shauldar(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Finding rows in table T1 that DO NOT MATCH any row in table T2 |
Date: | 2009-10-21 11:52:44 |
Message-ID: | 234efe30910210452i1d1be435p217f74b756bb5cf2@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Tom,
1. Actually I just tested you suggestion
SELECT COUNT (*) FROM T1 where NOT EXISTS
(SELECT 1 FROM T2 where T1.PK <http://t1.pk/> = T2.FK <http://t2.fk/>)
and in worked in PG 8.3.8. On a DB with 6M T1 records and 5M T2 records it
took 1m8s,
My suggestion, i.e.
SELECT COUNT(*) FROM T1 LEFT JOIN T2 ON T1.PK <http://t1.pk/> =
T2.FK<http://t2.fk/>
WHERE T2.FK <http://t2.fk/> IS NULL
was about twice as fast, 37s. (both returned same number of rows, about 2/3
of T1)
However I can use DELETE with your version (instead of "SELECT COUNT (*)"
above) but not with mine (can't have LEFT JOIN in DELETE), so YOU WIN.
Thanks!
2. BTW. I presented my question earlier in an overly simplified fashion.
Sorry. In actuality the two tables are joined on two columns,
say Ka and Kb (a composite key column), e.g. T1.PKa = T2.FKa and T1.PKb =
T2.FKb. So the IN versions suggested will not work
since AFAIK IN only works for a single value.
-- Shaul
On Tue, Oct 20, 2009 at 3:59 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Shaul Dar <shauldar(at)gmail(dot)com> writes:
> > I assume this will work but will take a long time:
>
> > DELETE * FROM T1 where T1.PK NOT IN
> > (SELECT T1.PK FROM T1, T2 where T1.PK = T2.FK)
>
> Well, yeah, but it's unnecessarily inefficient --- why not just
>
> DELETE FROM T1 where T1.PK NOT IN
> (SELECT T2.FK FROM T2)
>
> However, that still won't be tremendously fast unless the subselect fits
> in work_mem. As of 8.4 this variant should be reasonable:
>
> DELETE FROM T1 where NOT EXISTS
> (SELECT 1 FROM T2 where T1.PK = T2.FK)
>
> Pre-8.4 you should resort to the "left join where is null" trick,
> but there's no need to be so obscure as of 8.4.
>
> regards, tom lane
>
From | Date | Subject | |
---|---|---|---|
Next Message | jesper | 2009-10-21 15:03:09 | Random penalties on GIN index updates? |
Previous Message | Melton Low | 2009-10-20 16:14:13 | Re: Finding rows in table T1 that DO NOT MATCH any row in table T2 |