Re: Why hash join instead of nested loop?

From: Rhett Garber <rhettg(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Havasvölgyi Ottó <h(dot)otto(at)freemail(dot)hu>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Why hash join instead of nested loop?
Date: 2005-08-09 19:11:52
Message-ID: 41b0fe8905080912113f381f84@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Bingo, the smaller the sort_mem, the faster that query is.

Thanks a lot to everybody that helped, i'll tweak with these values
more when I get a chance now that I have some guidelines that make
sense.

Rhett

On 8/9/05, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Rhett Garber <rhettg(at)gmail(dot)com> writes:
> > Well that could be an issue, is this abnormally large:
> > #shared_buffers = 1536 # min 16, at least max_connections*2, 8KB each
> > shared_buffers = 206440
> > #sort_mem = 131072 # min 64, size in KB
> > sort_mem = 524288 # min 64, size in KB
> > vacuum_mem = 131072 # min 1024, size in K
>
> The vacuum_mem number is OK I think, but both of the others seem
> unreasonably large. Conventional wisdom about shared_buffers is that
> the sweet spot is maybe 10000 or so buffers, rarely more than 50000.
> (Particularly in pre-8.0 releases, there are code paths that grovel
> through all the buffers linearly, so there is a significant cost to
> making it too large.) Don't worry about it being too small to make
> effective use of RAM --- we rely on the kernel's disk cache to do that.
>
> sort_mem is *per sort*, and so half a gig in a machine with only a
> couple of gig is far too much except when you know you have only one
> query running. A couple dozen backends each trying to use half a gig
> will drive you into the ground in no time. Conventional wisdom here
> is that the global setting should be conservatively small (perhaps
> 10Mb to 100Mb depending on how many concurrent backends you expect to
> have), and then you can explicitly increase it locally with SET for
> specific queries that need it.
>
> In terms of the problem at hand, try the test case with a few different
> values of sort_mem (use SET to adjust it, you don't need to touch the
> config file) and see what happens. I think the cost you're seeing is
> just startup overhead to zero a hash table of a few hundred meg ...
>
> regards, tom lane
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Dan Harris 2005-08-09 20:42:57 Re: Table locking problems?
Previous Message Tom Lane 2005-08-09 19:08:06 Re: Table locking problems?