From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: postgresql(at)os10000(dot)net
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: parallel query evaluation
Date: 2012-11-10 15:32:25
Message-ID: 9130.1352561545@sss.pgh.pa.us
Lists: pgsql-performance
Oliver Seidel <postgresql(at)os10000(dot)net> writes:
> I have
>     create table x ( att bigint, val bigint, hash varchar(30) );
> with 693 million rows. The query
>     create table y as
>     select att, val, count(*) as cnt
>     from x
>     group by att, val;
> ran for more than 2000 minutes and used 14 GB of memory on a machine
> with 8 GB of physical RAM.
What was the plan for that query? What did you have work_mem set to?
I can believe such a thing overrunning memory if the planner chose to
use a hash-aggregation plan instead of sort-and-unique, but it would
only do that if it had made a drastic underestimate of the number of
groups implied by the GROUP BY clause. Do you have up-to-date
statistics for the source table?
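
For reference, a minimal way to gather both pieces of information,
using only the query quoted above (nothing is assumed here beyond the
original table and column names):

    -- show the plan the planner chooses for the aggregation
    explain
    select att, val, count(*) as cnt
    from x
    group by att, val;

    -- show the current per-operation memory budget
    show work_mem;

If the EXPLAIN output shows a HashAggregate node whose row estimate is
far below the true number of (att, val) groups, that would match the
failure mode described above.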
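
And a sketch of refreshing the statistics and inspecting the planner's
distinct-value estimates; pg_stats is the standard statistics view, and
the column list is taken from the quoted GROUP BY clause:

    -- refresh the optimizer statistics for the source table
    analyze x;

    -- n_distinct drives the planner's estimate of the group count
    select attname, n_distinct
    from pg_stats
    where tablename = 'x'
      and attname in ('att', 'val');

A negative n_distinct is a ratio of the table's row count rather than
an absolute count; either way, a large underestimate here is what would
push the planner toward hash aggregation.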
regards, tom lane