From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Subject: Re: hashagg, statistisics and excessive memory allocation
Date: 2006-05-11 19:04:20
Message-ID: 200605111204.21771.josh@agliodbs.com
Lists: pgsql-hackers
Stefan,
> Some testing showed that the planner was seriously underestimating the
> number of distinct rows in the table (with the default statistics target
> it estimated ~150k rows while there are about 19M distinct values) and
> choosing a hashagg for the aggregate.
> Upping the statistics target to 1000 improves the estimate to about 5M
> rows, which unfortunately is still not enough to cause the planner to
> switch to a groupagg with work_mem set to 256000.
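(For illustration, the tuning described above looks roughly like the sketch
below; big_table and val are made-up names, not from Stefan's report.)

    -- Raise the per-column statistics target and re-analyze
    ALTER TABLE big_table ALTER COLUMN val SET STATISTICS 1000;
    ANALYZE big_table;

    -- Compare the planner's estimate with the real distinct count
    SELECT n_distinct FROM pg_stats
     WHERE tablename = 'big_table' AND attname = 'val';
    SELECT count(DISTINCT val) FROM big_table;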
Well, it's pretty well-known that we need to fix n-distinct estimation.
But we also need to handle it gracefully if the estimate is still wrong
and we start using too much memory. Is there any way we can check how
much memory the hashagg actually *is* using and spill to disk if it goes
over work_mem?
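For anyone who wants to reproduce the planner side of this, a minimal
sketch (again with made-up table and column names):

    -- See which aggregation strategy the planner picks at a given work_mem
    SET work_mem = 256000;   -- in kB, as in Stefan's test
    EXPLAIN ANALYZE
    SELECT val, count(*)
      FROM big_table
     GROUP BY val;

The top plan node shows whether a HashAggregate or a GroupAggregate was
chosen; when n_distinct is badly underestimated, the HashAggregate's actual
memory use can blow well past work_mem, and right now that is only visible
from outside (e.g. by watching the backend's process size).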
--
--Josh
Josh Berkus
PostgreSQL @ Sun
San Francisco