From: | "Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au> |
---|---|
To: | "Fabien COELHO" <coelho(at)cri(dot)ensmp(dot)fr>, "PostgreSQL Developers" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: suggestions to improve postgresql suitability for data-mining |
Date: | 2003-07-23 01:33:54 |
Message-ID: | 0efc01c350ba$7ad71140$2800a8c0@mars |
Lists: pgsql-hackers
> II) SQL
> -------
>
> The first idea is to ask SQL to do the job with a 'group by' clause:
>
> SELECT area, type, month, SUM(amount), COUNT(*)
> FROM client AS c, invoice AS i
> WHERE c.id=i.client
> GROUP BY area, type, month;
>
> As I am only interested in reading the data, without any transactions, I
> tuned the database parameters a little (fsync=false, more shared_mem
> and sort_mem).
>
> It works, but it is quite slow and it requires a lot of disk space.
> Indeed, the result of the join is big, and the aggregation seems to
> require an external sort step so as to sum up data one group after the
> other.
>
> As the resulting table is very small, I wish the optimizer would have
> skipped the sort phase, so as to aggregate the data as they come out of the
> join. All of this could be done on the fly without much additional storage
> (well, with some implementation effort). Maybe this is the "hash evaluation
> of group by aggregates" item listed in the todo list.
As of 7.4 CVS it does do this. You will find this much faster in the 7.4
release.
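
For example, a quick way to check on a 7.4 build is to EXPLAIN the query
(table and column names taken from your example; the plan shape below is
only illustrative, not real output):

    EXPLAIN
    SELECT area, type, month, SUM(amount), COUNT(*)
    FROM client AS c, invoice AS i
    WHERE c.id = i.client
    GROUP BY area, type, month;

    -- Something along these lines, if the planner picks hashed aggregation:
    --  HashAggregate
    --    ->  Hash Join
    --          Hash Cond: (i.client = c.id)
    --          ->  Seq Scan on invoice i
    --          ->  Hash
    --                ->  Seq Scan on client c

If you still see a Sort feeding a GroupAggregate, check enable_hashagg and
try raising sort_mem -- as far as I know the planner only chooses hashed
aggregation when it estimates the hash table will fit in sort_mem.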
Chris