Re: hash join vs nested loop join

From: "Kevin Grittner" <kgrittn(at)mail(dot)com>
To: "Huan Ruan" <huan(dot)ruan(dot)it(at)gmail(dot)com>
Cc: "Huan Ruan" <leohuanruan(at)gmail(dot)com>,pgsql-performance(at)postgresql(dot)org
Subject: Re: hash join vs nested loop join
Date: 2012-12-14 00:54:43
Message-ID: 20121214005443.80090@gmx.com
Lists: pgsql-performance

Huan Ruan wrote:

> Interesting to see how you derived 100% cache hits. I assume by 'cache' you
> mean the pg shared buffer plus the OS cache? Because the table is 23GB but
> the shared buffer is only 6GB. Even then, I'm not completely convinced
> because the total RAM is just 24GB, part of which will have to be used for
> other data and indexes.

Well, you can't have more than a few disk reads, which typically
take something like 10 ms each, and still end up with an average of
less than 200 nanoseconds.
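
To put rough numbers on that (illustrative arithmetic only, reading
the 200 ns figure as an average per index lookup): one 10 ms read is
10,000,000 ns, so even a single read would have to be amortized over
about 50,000 lookups just to stay inside that average.

  -- illustrative arithmetic only: how many lookups one 10 ms disk read
  -- must be spread across to keep the average at 200 ns per lookup
  SELECT (10 * 1000 * 1000) / 200 AS lookups_per_disk_read;  -- 50000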

> I read somewhere that a pg shared buffer that's too big can hurt the
> performance and it's better just leave it to the OS cache. I'm not sure why
> but for now, I just configured the shared buffer to be 1/4 of the total RAM.

PostgreSQL goes through the OS and its filesystems, unlike some
other products. The right balance of memory in the PostgreSQL
shared buffers versus what is left for a combination of OS caching
and other memory allocations can be hard to determine. 25% is a
good starting point, but your best performance might be higher or
lower. It's a good idea to try incremental adjustments using your
actual workload. Just remember you need to allow enough for several
maintenance_work_mem allocations, about one work_mem allocation per
max_connections setting, plus a reasonable OS cache size.
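
As a rough sketch of that budget (the 6GB shared_buffers and 80MB
work_mem come from your figures; 100 connections and a 1GB
maintenance_work_mem are only assumed placeholders, substitute your
own settings):

  -- rough footprint: shared_buffers + max_connections * work_mem
  --                  + a few maintenance_work_mem allocations;
  -- whatever is left of the 24GB goes to the OS cache and other processes
  SELECT pg_size_pretty(
           (6::bigint   * 1024*1024*1024)    -- shared_buffers = 6GB
         + (100::bigint * 80 * 1024*1024)    -- 100 connections * work_mem = 80MB
         + (3::bigint   * 1024*1024*1024)    -- ~3 maintenance_work_mem of 1GB
         ) AS rough_postgres_footprint;      -- roughly 17GB of the 24GB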

> I was wondering on our production server where the effetive_cache_size will
> be much bigger, will pg then guess that probably most data is cached anyway
> therefore leaning towards nested loop join rather than a scan for hash join?

Once effective_cache_size is larger than your largest index, its
exact value doesn't matter all that much.
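
A quick way to sanity-check that on your server (just a sketch; nothing
here changes any settings):

  -- largest indexes in the current database versus the current setting
  SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS index_size
  FROM pg_class
  WHERE relkind = 'i'
  ORDER BY pg_relation_size(oid) DESC
  LIMIT 5;

  SHOW effective_cache_size;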

> Even on a test server where the cache hit rate is much smaller, for a big
> table like this, under what circumstances, will a hash join perform better
> than nested loop join though?

With a low cache hit rate, that would generally be when the number
of lookups into the table exceeds about 10% of the table's rows.
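
If you want to see where that crossover sits on your own data, one
sketch (the table and column names below are made up, not from your
schema) is to time the same join with the hash join disabled for the
session and compare the two EXPLAIN ANALYZE results:

  EXPLAIN ANALYZE
  SELECT b.id, s.qty
  FROM   bigtable b
  JOIN   small_lookup s ON s.big_id = b.id
  WHERE  s.batch_no = 42;

  SET enable_hashjoin = off;   -- experiment knob only; don't leave this on
  EXPLAIN ANALYZE
  SELECT b.id, s.qty
  FROM   bigtable b
  JOIN   small_lookup s ON s.big_id = b.id
  WHERE  s.batch_no = 42;

  RESET enable_hashjoin;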

> Yes, I had bumped up work_mem yesterday to speed up another big group by
> query. I used 80MB. I assumed this memory will only be used if the query
> needs it and will be released as soon as it's finished, so it won't be too
> much an issue as long as I don't have too many concurrently sorting queries
> running (which is true in our production). Is this correct?

Each connection running a query can allocate one work_mem
allocation per plan node (depending on node type), which will be
freed after the query completes. A common "rule of thumb" is to
plan on peaks of max_connections allocations of work_mem.
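
With your 80MB setting the rule-of-thumb worst case looks something
like this (100 for max_connections is only an assumed figure; check
your own setting). Note that work_mem can also be raised with SET just
for the sessions running the big sorts, rather than globally.

  SHOW max_connections;
  SELECT pg_size_pretty(100::bigint * 80 * 1024*1024)
         AS worst_case_work_mem;   -- 8000 MB, roughly a third of the 24GB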

> I increased maintenance_work_mem initially to speed up the index creation
> when I first pump in the data. In production environment, we don't do run
> time index creation, so I think only the vacuum and analyze will consume
> this memory?

You'll probably be creating indexes from time to time. Figure an
occasional one of those plus up to one allocation per autovacuum
worker (and you probably shouldn't go below three of those).
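
A sketch of that maintenance budget (the 1GB maintenance_work_mem
below is an assumed figure, not from your config):

  SHOW autovacuum_max_workers;                 -- default is 3
  SELECT pg_size_pretty((3 + 1)::bigint * 1024*1024*1024)
         AS maintenance_mem_peak;              -- 3 autovacuum workers
                                               -- + 1 CREATE INDEX, ~4GB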

-Kevin
