Quick Links

Question with hashed IN

From:	Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
To:	<pgsql-hackers(at)postgresql(dot)org>
Cc:	<tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Question with hashed IN
Date:	2003-08-17 03:38:55
Message-ID:	20030816201948.Y75834-100000@megazone.bigpanda.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

I've noticed that when the stats are wrong (like
in cases where you've loaded data but reltuples
hasn't been updated yet) that a hashed NOT IN
seems to take a significant time penalty. Is
this to be expected?

On a pktest table with 1 million integers and a dual table with a single
integer and sort_mem set high enough to give a hashed subplan for the
various reltuples values, I saw the following behavior for

explain analyze select * from dual where a not in (select a from pktest);

with reltuples=1000 for pktest, query takes about 96 seconds
reltuples=10000, query takes about 15 seconds
reltuples=100000, query takes about 8 seconds

And the memory usage seemed to be the same even if I set sort_mem back
to 1024.

Responses

Re: Question with hashed IN at 2003-08-17 03:49:27 from Stephan Szabo

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Stephan Szabo	2003-08-17 03:49:27	Re: Question with hashed IN
Previous Message	Bruce Momjian	2003-08-17 03:33:14	Re: [PATCH] Re: [pgsql-advocacy] Why READ ONLY transactions?