From: Greg Stark <gsstark(at)mit(dot)edu>
To: Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>
Cc: Luke Lonergan <llonergan(at)greenplum(dot)com>, stange(at)rentec(dot)com, Greg Stark <gsstark(at)mit(dot)edu>, Dave Cramer <pg(at)fastcrypt(dot)com>, Joshua Marsh <icub3d(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Hardware/OS recommendations for large databases (
Date: 2005-11-24 16:00:25
Message-ID: 87d5kqkpd2.fsf@stark.xeocode.com
Lists: pgsql-performance
Mark Kirkwood <markir(at)paradise(dot)net(dot)nz> writes:
> Yeah - it's pretty clear that the count aggregate is fairly expensive wrt cpu -
> However, I am not sure if all agg nodes suffer this way (guess we could try a
> trivial aggregate that does nothing for all tuples bar the last and just
> reports the final value it sees).
As you mention, count(*) and count(1) are the same thing.
Last I heard, the reason count(*) was so expensive was that its state
variable is a bigint. That means it doesn't fit in a Datum and has to be
allocated and stored as a pointer. And because of the Aggregate API, that
means it has to be allocated and freed for every tuple processed.
There was some talk of having a special case API for count(*) and maybe
sum(...) to avoid having to do this.
There was also some talk of making Datum 8 bytes wide on platforms where that
was natural (I guess AMD64, Sparc64, Alpha, Itanic).
Afaik none of these items have happened but I don't know for sure.
--
greg