Hash Aggregate plan picked for very large table == out of memory

From: "Mason Hale" <masonhale(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Hash Aggregate plan picked for very large table == out of memory
Date: 2007-06-14 20:15:03
Message-ID: 8bca3aa10706141315k38d89d74occ317907f68ed54d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

With Postgresql 8.1.9 -- I have a simple group by query:

SELECT target_page_id, min(created_at)
FROM page_page_link
GROUP BY 1;

The page_page_link table has ~130 million rows.

After analyzing the table, the planner picks a hash aggregate plan, which
results in an out of memory error.

crystal=> analyze page_page_link;
ANALYZE
crystal=> explain
crystal-> SELECT target_page_id as page_id, min(created_at) as created_at
crystal-> FROM page_page_link
crystal-> GROUP By 1
crystal-> ;
QUERY
PLAN
-----------------------------------------------------------------------------------
HashAggregate (cost=3663517.88..3670393.09 rows=550017 width=12)
-> Seq Scan on page_page_link (cost=0.00..2993649.92 rows=133973592
width=12)
(2 rows)

The default_statistics_target was originally 200.
I upped it to 1000 and still get the same results.

crystal=> show default_statistics_target;
default_statistics_target
---------------------------
1000
(1 row)

crystal=> set enable_hashagg = off;
SET
crystal=> explain
crystal-> SELECT target_page_id as page_id, min(created_at) as created_at
crystal-> FROM page_page_link
crystal-> GROUP BY 1
crystal-> ;
QUERY
PLAN
-----------------------------------------------------------------------------------------
GroupAggregate (cost=27240841.37..28252518.53 rows=550017 width=12)
-> Sort (cost=27240841.37..27575775.35 rows=133973592 width=12)
Sort Key: target_page_id
-> Seq Scan on page_page_link
(cost=0.00..2993649.92rows=133973592 width=12)
(4 rows)

crystal=>

I am working around this by setting enable_hashagg = off -- but it just
seems like a case where the planner is not picking the strategy?

Is there another setting I can change to help make better decisions?

thanks in advance,

Mason

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Francisco Reyes 2007-06-14 20:17:51 pg_restore out of memory
Previous Message Shoaib Mir 2007-06-14 20:06:43 Re: Function with COPY command?