| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
|---|---|
| To: | Alban Hertroys <alban(at)magproductions(dot)nl> | 
| Cc: | "G(dot) Ralph Kuntz, MD" <grk(at)usa(dot)net>, pgsql-general(at)postgresql(dot)org | 
| Subject: | Re: Q: Table scans on set difference | 
| Date: | 2006-07-14 15:36:23 | 
| Message-ID: | 20529.1152891383@sss.pgh.pa.us | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-general | 
Alban Hertroys <alban(at)magproductions(dot)nl> writes:
> G. Ralph Kuntz, MD wrote:
>> explain select file_name from encounter_properties_table where file_name not
>> in (select filename from xfiles);
> What about:
> explain select file_name from encounter_properties_table
> where not exists (
> 	select file_name from xfiles where filename = file_name);
If you only need the file name, an EXCEPT would probably work much
better:
select file_name from encounter_properties_table
except
select filename from xfiles;
Another possibility is to abuse the outer join machinery:
select file_name, ... from 
	encounter_properties_table l left join xfiles r
	  on l.file_name = r.filename
where r.filename is null;
Generally speaking, NOT IN performance is going to suck unless the
sub-select is small enough to fit in a hashtable.  You could consider
increasing work_mem enough that it would fit, but with 500K filenames
needed, that's probably not going to win.
regards, tom lane
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tom Lane | 2006-07-14 15:48:25 | Re: apparent wraparound | 
| Previous Message | Florian Weimer | 2006-07-14 15:31:38 | Re: Timestamp vs timestamptz |