Re: question on hash joins

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Hartranft, Robert M(dot) (GSFC-423(dot)0)[RAYTHEON CO]" <robert(dot)m(dot)hartranft(at)nasa(dot)gov>
Cc: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: question on hash joins
Date: 2017-10-19 16:15:13
Message-ID: 24789.1508429713@sss.pgh.pa.us
Lists: pgsql-admin

"Hartranft, Robert M. (GSFC-423.0)[RAYTHEON CO]" <robert(dot)m(dot)hartranft(at)nasa(dot)gov> writes:
> Given that the hash would only contain keys and values needed for supporting
> the query I am having a hard time understanding why I am exceeding the
> 10 GB temp_file_limit.

Because *both sides* of the join are getting dumped to temp files.
This is necessary when the hash requires multiple batches. All
but the first batch of hash keys get written out to files, and then
we reload each batch of the inner relation into the in-memory table
and scan the corresponding batch file from the outer relation.
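As an illustrative sketch (not part of the original reply; the tables, row counts, and settings are made up), the multi-batch behavior is visible in EXPLAIN (ANALYZE) output, where the Hash node reports how many batches it needed:

    -- Hypothetical tables, purely for demonstration.
    CREATE TABLE big_inner (id int, payload text);
    CREATE TABLE big_outer (id int, payload text);
    INSERT INTO big_inner SELECT g, repeat('x', 100) FROM generate_series(1, 1000000) g;
    INSERT INTO big_outer SELECT g, repeat('y', 100) FROM generate_series(1, 1000000) g;
    ANALYZE big_inner;
    ANALYZE big_outer;

    SET work_mem = '4MB';   -- deliberately small, to force multiple batches
    EXPLAIN (ANALYZE)
    SELECT count(*)
    FROM big_outer o JOIN big_inner i USING (id);
    -- The Hash node line will look something like:
    --   Buckets: 131072  Batches: 16  Memory Usage: ...kB
    -- Batches > 1 means both sides of the join are being spilled to temp files,
    -- and all of that spill counts against temp_file_limit.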

If you can make work_mem large enough to hold the inner relation,
then the problem should go away. Note though that the per-row
overhead is significantly larger in the in-memory representation;
I don't have an estimate for that offhand.
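For completeness (again, not in the original message), raising work_mem for just the problem session is one way to test whether the inner relation fits in memory; the value below is only a placeholder, so pick one your server can actually afford:

    SET work_mem = '2GB';   -- session-level only; choose a realistic value
    EXPLAIN (ANALYZE)
    SELECT count(*)
    FROM big_outer o JOIN big_inner i USING (id);
    -- If the Hash node now reports "Batches: 1", the hash table fit in memory
    -- and this join writes nothing that counts against temp_file_limit.
    RESET work_mem;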

regards, tom lane
