From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
Cc: | Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Олег Царев <zabivator(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> |
Subject: | Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities) |
Date: | 2009-05-12 13:27:22 |
Message-ID: | 603c8f070905120627g5446fe09wfd1300c49f5d8fef@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, May 12, 2009 at 2:21 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> Moreover, I guess you don't even need to buffer tuples to aggregate by
>> different keys. What you have to do is only to prepare more than one
>> hash tables (, or set up sort order if the plan detects hash table is
>> too large to fit in the memory), and one time seq scan will do. The
>> trans values are only to be stored in the memory, not the outer plan's
>> results. It will win greately in performance.
>
> it was my first solution. But I would to prepare one non hash method.
> But now I thinking about some special executor node, that fill all
> necessary hash parallel. It's special variant of hash agreggate.
I think HashAggregate will often be the fastest method of executing
this kind of operation, but it would be nice to have an alternative
(such as repeatedly sorting a tuplestore) to handle non-hashable
datatypes or cases where the HashAggregate would eat too much memory.
But that leads me to a question - does the existing HashAggregate code
make any attempt to obey work_mem? I know that the infrastructure is
present for HashJoin/Hash, but on a quick pass I didn't notice
anything similar in HashAggregate.
And on a slightly off-topic note for this thread, is there any
compelling reason why we have at least three different hash
implementations in the executor? HashJoin/Hash uses one for regular
batches and one for the skew batch, and I believe that HashAggregate
does something else entirely. It seems like it might improve code
maintainability, if nothing else, to unify these to the extent
possible.
...Robert
From | Date | Subject | |
---|---|---|---|
Next Message | Simon Riggs | 2009-05-12 14:34:10 | Re: New trigger option of pg_standby |
Previous Message | Robert Haas | 2009-05-12 13:19:30 | Re: DROP TABLE vs inheritance |