
Re: GROUP BY on a large table -- an idea

From:	Markus Schaber <schabi(at)logix-tt(dot)com>
To:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc:	Dawid Kuroczko <qnex42(at)gmail(dot)com>
Subject:	Re: GROUP BY on a large table -- an idea
Date:	2006-10-15 09:16:12
Message-ID:	4531FC5C.20703@logix-tt.com
Lists:	pgsql-hackers

Hi, Dawid,

Dawid Kuroczko wrote:

> The hybrid approach means: sort as much as you can without spilling to
> disk, then aggregate and store aggregate state variables in safe place
> (like a "tree" above), get more tuples from the table, sort them, update
> aggregate state variables, lather, rinse, repeat.

For this to work, the aggregate definition needs an additional function
that merges two states into one. That function is what the "update
aggregate state variables" step would call.
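
To illustrate the merge step, here is a minimal sketch in Python rather
than anything resembling backend code. It assumes an AVG-like aggregate
whose state is a (sum, count) pair. The names avg_init, avg_transition,
avg_merge, avg_final and hybrid_group_avg are made up for this example,
and batch_size merely stands in for "as much as fits in memory".

```python
from itertools import groupby, islice
from operator import itemgetter

# Hypothetical AVG-like aggregate; the state is a (sum, count) pair.
def avg_init():
    return (0.0, 0)

def avg_transition(state, value):
    # Fold one input value into the state.
    total, count = state
    return (total + value, count + 1)

def avg_merge(a, b):
    # The extra function discussed here: combine two partial states.
    return (a[0] + b[0], a[1] + b[1])

def avg_final(state):
    total, count = state
    return total / count if count else None

def hybrid_group_avg(rows, batch_size):
    """Group (key, value) rows batch by batch.

    Each batch is sorted and aggregated on its own, then its per-key
    states are merged into the states kept from earlier batches.
    """
    merged = {}  # stands in for the "safe place" holding the states
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            break
        batch.sort(key=itemgetter(0))
        for key, group in groupby(batch, key=itemgetter(0)):
            state = avg_init()
            for _, value in group:
                state = avg_transition(state, value)
            if key in merged:
                merged[key] = avg_merge(merged[key], state)
            else:
                merged[key] = state
    return {key: avg_final(state) for key, state in merged.items()}
```

Without avg_merge, a per-batch state could not be combined with the
stored one. The result should not depend on how the input is split into
batches, which is the property the merge function has to guarantee.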

Recently, there was some discussion that the Bizgres MPP people already
have such a function for merging states of different backend processes,
and that the query planner could benefit from such a function, e.g. in
case of UNION or table partitioning.

Maybe we should come up with an exact definition of the syntax and
semantics of this function that satisfies the needs of all three use
cases above?

Thanks,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org

In response to

GROUP BY on a large table -- an idea at 2006-10-12 07:52:11 from Dawid Kuroczko
