From: wickro <robwickert(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: work_mem greater than 2GB issue
Date: 2009-05-14 15:47:24
Message-ID: 6aaebc70-64c4-4627-97fc-9aea94f7d143@m24g2000vbp.googlegroups.com
Lists: pgsql-general
You're right. At a certain work_mem threshold the planner switches over to a
HashAggregate plan. When it does, it eats up a lot of memory. With
GroupAggregate it never uses more than work_mem.
I'm using PostgreSQL 8.3.3 64-bit on CentOS 5.
The query I'm running is:
select keyword, partner_id, sum(num_searches) as num_searches
from partner_country_keywords
group by partner_id, keyword
against
create table partner_country_keywords (
keyword varchar,
partner_id integer,
country char(2),
num_searches integer
);
The table partner_country_keywords is 8197 MB according to pg_class and
contains 126,171,000 records.
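For reference, this is roughly how I've been comparing the two plans
(session-level settings only, same query as above):

-- below the threshold: planner picks the sort-based GroupAggregate
SET work_mem = '1300MB';
EXPLAIN
select keyword, partner_id, sum(num_searches) as num_searches
from partner_country_keywords
group by partner_id, keyword;

-- above the threshold: planner switches to HashAggregate
SET work_mem = '2GB';
EXPLAIN
select keyword, partner_id, sum(num_searches) as num_searches
from partner_country_keywords
group by partner_id, keyword;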
EXPLAIN at work_mem = 2GB:

HashAggregate  (cost=3257160.40..3414874.00 rows=12617088 width=28)
  ->  Seq Scan on partner_country_keywords  (cost=0.00..2310878.80 rows=126170880 width=28)

This will continue to eat memory up to about 7.2GB (on an 8GB machine).
EXPLAIN at work_mem = 1300MB:

GroupAggregate  (cost=22306417.24..23725839.64 rows=12617088 width=28)
  ->  Sort  (cost=22306417.24..22621844.44 rows=126170880 width=28)
        Sort Key: partner_id, keyword
        ->  Seq Scan on partner_country_keywords  (cost=0.00..2310878.80 rows=126170880 width=28)
So is this a planning mistake? Should a hash be allowed to grow larger
than work_mem before it starts to use the disk?
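In the meantime, one possible workaround (just a sketch, I haven't
benchmarked it) is to disable hash aggregation for the session so the
planner falls back to the sort-based plan:

-- force GroupAggregate for this session only
SET enable_hashagg = off;
EXPLAIN
select keyword, partner_id, sum(num_searches) as num_searches
from partner_country_keywords
group by partner_id, keyword;

That keeps memory use within work_mem, but it obviously sidesteps the
question above.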
On May 14, 4:11 pm, st(dot)(dot)(dot)(at)enterprisedb(dot)com (Gregory Stark) wrote:
> wickro <robwick(dot)(dot)(dot)(at)gmail(dot)com> writes:
> > Hi everyone,
>
> > I have a largish table (> 8GB) that I'm doing a very simple single group
> > by on. I am the only user of this database. If I set work_mem to
> > anything under 2GB (e.g. 1900MB) the postmaster process stops at that
> > value while it's performing its group by. There is only one hash
> > operation, so that is what I would expect. But anything larger and it
> > eats up all memory until it can't get any more (around 7.5GB on an 8GB
> > machine). Has anyone experienced anything of this sort before?
>
> What does EXPLAIN say for both cases? I suspect what's happening is that the
> planner is estimating it will need 2G to hash all the values when in fact it
> would need >8G. So for values under 2G it uses a sort and not a hash at all;
> for values over 2G it's trying to use a hash and failing.
>
> --
> Gregory Stark
> EnterpriseDB http://www.enterprisedb.com
> Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
>
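PS: one more thing worth checking (a sketch, not verified): the
12,617,088-group estimate in the plans above comes from the per-column
statistics, so if n_distinct for partner_id/keyword is badly off, the
planner's idea of how much memory the hash needs will be too.

-- inspect the statistics the planner bases its group estimate on
SELECT attname, n_distinct, avg_width
FROM pg_stats
WHERE tablename = 'partner_country_keywords'
  AND attname IN ('partner_id', 'keyword');

-- refresh them if they look stale
ANALYZE partner_country_keywords;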