Quick Links

Re: Solving hash table overrun problems

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Bruno Wolff III <bruno(at)wolff(dot)to>
Cc:	pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: Solving hash table overrun problems
Date:	2005-03-04 15:42:08
Message-ID:	27706.1109950928@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Bruno Wolff III <bruno(at)wolff(dot)to> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> * Estimate the number of batches N using the planner's estimate.
>> We will always choose N a power of 2. A tuple's batch number is
>> ((H div K) mod N).

> If K is way low this could be very slow.

How so? You're not concerned about the time to do the division itself
are you?

>> * Now begin scanning the outer join input. Tuples of batch number
>> zero (according to the current calculation) can be matched to the
>> current hashtable contents. Tuples of higher batch numbers are dropped
>> into holding files for the outer input, one per batch.

> For new keys at this step do you know their final disposition so that making
> new hash entries won't be necessary?

Well, we probably have a pretty fair idea of N at this point, so it'll
usually be right --- but we reserve the right to increase N again later
in case we have to do so because one of the later inner batches is much
bigger than the zero batch we are currently processing.

regards, tom lane

In response to

Re: Solving hash table overrun problems at 2005-03-04 15:46:07 from Bruno Wolff III

Responses

Re: Solving hash table overrun problems at 2005-03-04 16:22:01 from Bruno Wolff III

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Bruno Wolff III	2005-03-04 15:46:07	Re: Solving hash table overrun problems
Previous Message	Tom Lane	2005-03-04 15:28:24	Re: bitmap AM design