From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Rod Taylor <pg(at)rbt(dot)ca>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Large Scale Aggregation (HashAgg Enhancement)
Date: 2006-01-17 14:52:10
Message-ID: 28124.1137509530@sss.pgh.pa.us
Lists: pgsql-hackers
Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> On Mon, 2006-01-16 at 20:02 -0500, Tom Lane wrote:
>> But our idea of the number of batches needed can change during that
>> process, resulting in some inner tuples being initially assigned to the
>> wrong temp file. This would also be true for hashagg.
> So we correct that before we start reading the outer table.
Why? That would require a useless additional pass over the data. With
the current design, we can process and discard at least *some* of the
data in a temp file when we read it, but a reorganization pass would
mean that it *all* goes back out to disk a second time.
Also, you assume that we can accurately tell how many tuples will fit in
memory in advance of actually processing them --- a presumption clearly
false in the hashagg case, and not that easy to do even for hashjoin.
(You can tell the overall size of a temp file, sure, but how do you know
how it will split when the batch size changes? A perfectly even split
is unlikely.)
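To make that concrete, here is a minimal C sketch, with invented names
(not the actual executor code), of why a resize migrates tuples: under
the usual power-of-two scheme the batch number is derived from the low
bits of the hash value, so doubling nbatch reassigns roughly half of
each old batch's tuples, and nothing guarantees the halves are even.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical batch assignment from the low bits of the hash value.
 * Assumes nbatch is a power of two. */
static int
batch_for(uint32_t hashvalue, int nbatch)
{
    return (int) (hashvalue & (uint32_t) (nbatch - 1));
}

int
main(void)
{
    uint32_t h = 0xDEADBEEF;

    /* The same tuple lands in different batches as nbatch grows. */
    printf("nbatch=4: batch %d\n", batch_for(h, 4));  /* batch 3 */
    printf("nbatch=8: batch %d\n", batch_for(h, 8));  /* batch 7: migrated */
    return 0;
}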
> OK, I see what you mean. Sounds like we should have a new definition for
> Aggregates, "Sort Insensitive", that allows them to work when the input
> ordering does not affect the result, since that case can be optimised
> much better when using HashAgg.
Please don't propose pushing this problem onto the user until it's
demonstrated that there's no other way. I don't want to become the
next Oracle, with forty zillion knobs that it takes a highly trained
DBA to deal with.
> But all of them sound ugly.
I was thinking along the lines of having multiple temp files per hash
bucket. If you have a tuple that needs to migrate from bucket M to
bucket N, you know that it arrived before every tuple that was assigned
to bucket N originally, so put such tuples into a separate temp file
and process them before the main bucket-N temp file. This might get a
little tricky to manage after multiple hash resizings, but in principle
it seems doable.
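For illustration, a minimal sketch of that scheme in C, with invented
names (two temp files per batch, the migrated-in file drained first):

#include <stdio.h>
#include <string.h>

typedef struct BatchFiles
{
    FILE *early;                /* migrated-in tuples, replayed first */
    FILE *main_file;            /* tuples assigned here originally */
} BatchFiles;

static void
spill(FILE **fp, const char *tuple)
{
    if (*fp == NULL)
        *fp = tmpfile();
    fprintf(*fp, "%s\n", tuple);
}

static void
replay_one(FILE *fp)
{
    char line[128];

    if (fp == NULL)
        return;
    rewind(fp);
    while (fgets(line, sizeof(line), fp))
        fputs(line, stdout);    /* stand-in for feeding the hash table */
}

int
main(void)
{
    BatchFiles b = {NULL, NULL};

    spill(&b.main_file, "tuple assigned to batch N from the start");
    spill(&b.early, "tuple migrated from batch M after a resize");

    /* Migrated tuples arrived before anything in the main file,
     * so drain the early file first to preserve arrival order. */
    replay_one(b.early);
    replay_one(b.main_file);
    return 0;
}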
regards, tom lane