Re: huge price database question..

From: Lee Hachadoorian <Lee(dot)Hachadoorian+L(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: huge price database question..
Date: 2012-03-21 16:45:43
Message-ID: CANnCtnJLK3padcJkv5Y8ieo5R5V_fiu-82CX2SaDQTJxkMy6iQ@mail.gmail.com
Lists: pgsql-general

On Tue, Mar 20, 2012 at 11:28 PM, Jim Green
<student(dot)northwestern(at)gmail(dot)com> wrote:

> On 20 March 2012 22:57, John R Pierce <pierce(at)hogranch(dot)com> wrote:
>
> > avg() in the database is going to be a lot faster than copying the data
> into
> > memory for an application to process.
>
> I see..
>

As an example, I ran averages on a 700,000-row table with 231 census
variables reported by state. Averaging all 231 columns grouped by state
inside Postgres beat doing the same in R by a factor of 130, NOT COUNTING
the additional minute or so it took to pull the table from Postgres into R.
To be fair, the numbers are not strictly comparable because the two ran on
different hardware, but the setup is not atypical: Postgres is running on a
heavy-hitting server while R is running on my desktop.

SELECT state, avg(col1), avg(col2), [...] avg(col231)
FROM some_table
GROUP BY state;

5741 ms

aggregate(dfSomeTable, by = list(dfSomeTable$state), FUN = mean, na.rm = TRUE)

754746 ms
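
If you want to stay in R but still let the database do the work, something
like the following sketch does it (assuming the DBI and RPostgres packages;
the dbname, host, and column names here are placeholders, not from the
benchmark above):

library(DBI)

# Connect using the RPostgres driver (RPostgreSQL works similarly);
# dbname and host are placeholder values.
con <- dbConnect(RPostgres::Postgres(), dbname = "census", host = "dbserver")

# Let Postgres do the aggregation and return only the ~50 summary rows,
# instead of pulling all 700,000 rows across the wire into R.
dfAvgs <- dbGetQuery(con, "
    SELECT state, avg(col1) AS avg_col1, avg(col2) AS avg_col2
    FROM some_table
    GROUP BY state;")

dbDisconnect(con)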

--Lee
