| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
|---|---|
| To: | Simon Riggs <simon(at)2ndquadrant(dot)com> | 
| Cc: | pgsql-hackers(at)postgresql(dot)org | 
| Subject: | Re: Performance gain from reduction of GROUP BY memory allocations | 
| Date: | 2005-08-30 00:25:51 | 
| Message-ID: | 15220.1125361551@sss.pgh.pa.us | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-announce pgsql-hackers | 
Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> I notice that Neil's patch regarding reducing the number of memory
> allocations during aggregation operations isn't mentioned. It was
> originally discussed in 8.0beta (2-3?) time.
> What happened there?
> - patch not committed in the end
> - committed but not mentioned, as a dropped item
> - committed but not mentioned, since part of a larger patch
Are you speaking of these patches?
2005-04-06 19:56 neilc
	* src/backend/utils/adt/: float.c, numeric.c: Apply the "nodeAgg"
	optimization to more of the builtin transition functions. This
	patch optimizes int2_sum(), int4_sum(), float4_accum() and
	float8_accum() to avoid needing to copy the transition function's
	state for each input tuple of the aggregate. In an extreme case
	(e.g. SELECT sum(int2_col) FROM table where table has a single
	column), it improves performance by about 20%. For more complex
	queries or tables with wider rows, the relative performance
	improvement will not be as significant.
2005-04-04 19:50 neilc
	* src/backend/utils/adt/numeric.c: This patch changes
	int2_avg_accum() and int4_avg_accum() use the nodeAgg performance
	hack Tom introduced recently. This means we can avoid copying the
	transition array for each input tuple if these functions are
	invoked as aggregate transition functions.
	
	To test the performance improvement, I created a 1 million row
	table with a single int4 column. Without the patch, SELECT avg(col)
	FROM table took about 4.2 seconds (after the data was cached); with
	the patch, it took about 3.2 seconds. Naturally, the performance
	improvement for a less trivial query (or a table with wider rows)
	would be relatively smaller.
2005-03-12 15:25 tgl
	* contrib/intagg/int_aggregate.c,
	contrib/intagg/int_aggregate.sql.in, doc/src/sgml/xaggr.sgml,
	doc/src/sgml/xfunc.sgml, src/backend/executor/nodeAgg.c,
	src/backend/utils/adt/int8.c: Adjust the API for aggregate function
	calls so that a C-coded function can tell whether it is being used
	as an aggregate or not.  This allows such a function to avoid
	re-pallocing a pass-by-reference transition value; normally it
	would be unsafe for a function to scribble on an input, but in the
	aggregate case it's safe to reuse the old transition value.  Make
	int8inc() do this.  This gets a useful improvement in the speed of
	COUNT(*), at least on narrow tables (it seems to be swamped by I/O
	when the table rows are wide).	Per a discussion in early December
	with Neil Conway.  I also fixed int_aggregate.c to check this,
	thereby turning it into something approaching a supportable
	technique instead of being a crude hack.
I don't recall how Neil's original patch differed from what got
applied...
regards, tom lane
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Simon Riggs | 2005-08-30 07:37:04 | Re: Performance gain from reduction of GROUP BY memory | 
| Previous Message | Simon Riggs | 2005-08-29 21:13:11 | Performance gain from reduction of GROUP BY memory allocations | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Luke Lonergan | 2005-08-30 00:33:55 | Re: SHMMAX seems entirely broken in OS X 10.4.2 | 
| Previous Message | Jaime Casanova | 2005-08-30 00:19:21 | lookup fail at DROP USER |