Re: Interesting Query Performance Question

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jonathan Hoover" <jhoover(at)yahoo-inc(dot)com>
Cc: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Interesting Query Performance Question
Date: 2010-11-06 15:59:48
Message-ID: 20024.1289059188@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

"Jonathan Hoover" <jhoover(at)yahoo-inc(dot)com> writes:
> [ poor performance with NOT IN ]

> Query E then is apparently the way to go, but shouldn't there be a way
> to get the query planner to take these steps itself? If A had ever
> finished, I'd sure like to have seen an EXPLAIN ANALYZE on it.

Well, just an EXPLAIN would have told you what the plan was like.
What I suspect was happening was that your manipulations of the query
altered the planner's estimate of the number of rows in the NOT IN's
subquery, causing it to pick (or not) a hash-table-based implementation
of NOT IN. The hashed approach is a lot faster but requires the
subquery's output to fit in work_mem.

In general, NOT IN is hard to optimize because of its weird behavior
for NULLs. I'd suggest looking into converting the query to use an
EXISTS instead.

regards, tom lane

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message David W Noon 2010-11-06 16:53:54 Re: Interesting Query Performance Question
Previous Message Jonathan Hoover 2010-11-06 15:01:18 Interesting Query Performance Question