Re: Question with hashed IN

From: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Question with hashed IN
Date: 2003-08-17 04:55:15
Message-ID: 20030816214853.A77643-100000@megazone.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On Sun, 17 Aug 2003, Tom Lane wrote:

> Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com> writes:
> >> with reltuples=1000 for pktest, query takes about 96 seconds
> >> reltuples=10000, query takes about 15 seconds
> >> reltuples=100000, query takes about 8 seconds
>
> > Errm, I meant in the cases where it still chose a hashed
> > subplan. Stupid cold medicine.
>
> I'm confused too. Please explain again when you're feeling better...

Basically, the first thing I noticed was that changing reltuples
on the pg_class row for a table affected the speed of

explain analyze select * from othertable where foo not in (select bar from
table);

even when the plan wasn't changing, seqscan + filter on hashed subquery.
I thought that was kind of odd since the plan didn't seem any different,
but the real world time changed by about a factor of 10.

Then I noted that changing sort_mem changed the point at which it would
choose a hashed subquery in the initial plan based on the estimated
tuples, but didn't seem to actually affect the real memory usage, which
means that a table with a few million rows but reltuples still set at 1000
would eat up a very large amount of memory (in my case it sent my machine
a few hundred megs into swap).

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2003-08-17 05:00:01 Re: Parsing speed (was Re: pgstats_initstats() cost)
Previous Message Bruce Momjian 2003-08-17 04:43:50 Re: WITH HOLD and pooled connections