From: | Brian Hirt <bhirt(at)mobygames(dot)com> |
---|---|
To: | pgsql-hackers(at)postgreSQL(dot)org |
Cc: | bhirt(at)loopy(dot)berkhirt(dot)com |
Subject: | Re: [HACKERS] Slow - grindingly slow - query |
Date: | 1999-11-12 09:49:01 |
Message-ID: | 19991112034901.B21136@loopy.berkhirt.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> > > select accountdetail.domain from accountdetail where
> > > accountdetail.domain not in
> > > (select accountmaster.domain from accountmaster);
>
> This takes more than 5 hours and 30 minutes.
>
> > select accountdetail.domain from accountdetail where
> > not exists (select accountmaster.domain from accountmaster where
> > accountmaster.domain = accountdetail.domain);
>
> This takes 5 seconds - wow!
>
I have a general comment/question here. Why do in/not in clauses seem
to perform so slowly? I've noticed this type of behavior with with my
system also. I think the above queries will always return the exact
same results regardless of the data. From looking at the query plan
with explain, it's clear the second query makes better use of the
indexes. Can't the rewrite engine recognize a simple case like the
one above and rewrite it to use exists and not exists with the proper
joins? Or possibly the optimizer can generate a better plan? Sometimes
it's not so easy to just change a query in the code. Sometimes you can't
change the code because you only have executables and sometimes you are
using a tool that automatically generates SQL using in clauses.
Additionally, since intersect and union get rewritten as in clauses they
suffer the same performance problems.
-brian
--
The world's most ambitious and comprehensive PC game database project.
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 1999-11-12 10:04:03 | Re: psql and \p\g |
Previous Message | Karel Zak - Zakkr | 1999-11-12 09:38:55 | Re: [HACKERS] compression in LO and other fields |