Re: work_mem greater than 2GB issue

From: wickro <robwickert(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: work_mem greater than 2GB issue
Date: 2009-05-14 15:47:24
Message-ID: 6aaebc70-64c4-4627-97fc-9aea94f7d143@m24g2000vbp.googlegroups.com
Lists: pgsql-general

You're right. At a certain work_mem threshold the planner switches over to a
HashAggregate plan, and when it does, it eats up a lot of memory. With
GroupAggregate it never uses more than work_mem.

I'm using PostgreSQL 8.3.3 (64-bit) on CentOS 5.
The query I'm running is:

select keyword, partner_id, sum(num_searches) as num_searches
from partner_country_keywords
group by partner_id, keyword

against

create table partner_country_keywords (
keyword varchar,
partner_id integer,
country char(2),
num_searches integer
);

The table partner_country_keywords is 8197 MB according to pg_class, with
126,171,000 rows.
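
For reference, here's one way to pull those figures (just a sketch, not the
exact queries I ran; pg_size_pretty and pg_class.relpages/reltuples are
standard, and this assumes the default 8 kB block size):

-- Heap size from relpages (8 kB blocks) and the planner's row estimate.
-- reltuples is maintained by ANALYZE/VACUUM, so it's an estimate, not an exact count.
SELECT pg_size_pretty(relpages::bigint * 8192) AS heap_size,
       reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'partner_country_keywords';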

EXPLAIN at work_mem = 2GB:

HashAggregate  (cost=3257160.40..3414874.00 rows=12617088 width=28)
  ->  Seq Scan on partner_country_keywords  (cost=0.00..2310878.80 rows=126170880 width=28)

This will continue to eat memory up to about 7.2GB (on an 8GB machine).

EXPLAIN at work_mem = 1300MB:

"GroupAggregate (cost=22306417.24..23725839.64 rows=12617088
width=28)"
" -> Sort (cost=22306417.24..22621844.44 rows=126170880 width=28)"
" Sort Key: partner_id, keyword"
" -> Seq Scan on partner_country_keyword
(cost=0.00..2310878.80 rows=126170880 width=28)"

So is this a planning mistake? Should a hash be allowed to grow larger
than work_mem before it starts spilling to disk?
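
In the meantime, the workaround I'm looking at is forcing the sort plan for
this session (a sketch only, assuming enable_hashagg behaves as documented;
I haven't settled on the exact work_mem value):

-- Disable hash aggregation for this session so the planner falls back to the
-- Sort + GroupAggregate plan; the sort spills to disk once it exceeds
-- work_mem instead of growing without bound.
SET enable_hashagg = off;
SET work_mem = '1300MB';

select keyword, partner_id, sum(num_searches) as num_searches
from partner_country_keywords
group by partner_id, keyword;

RESET enable_hashagg;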

On May 14, 4:11 pm, st(dot)(dot)(dot)(at)enterprisedb(dot)com (Gregory Stark) wrote:
> wickro <robwick(dot)(dot)(dot)(at)gmail(dot)com> writes:
> > Hi everyone,
>
> > I have a largish table (> 8GB). I'm doing a very simple single group
> > by on it. I am the only user of this database. If I set work_mem to
> > anything under 2GB (e.g. 1900MB) the postmaster process stops at that
> > value while it's performing its group by. There is only one hash
> > operation, so that is what I would expect. But anything larger and it
> > eats up all memory until it can't get any more (around 7.5GB on an 8GB
> > machine). Has anyone experienced anything of this sort before?
>
> What does EXPLAIN say for both cases? I suspect what's happening is that the
> planner is estimating it will need 2G to hash all the values and in fact it
> would need >8G. So for values under 2G it uses a sort and not a hash at all;
> for values over 2G it's trying to use a hash and failing.
>
> --
>   Gregory Stark
>   EnterpriseDB          http://www.enterprisedb.com
>   Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
