From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, David Fetter <david(at)fetter(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>
Subject: Re: Spilling hashed SetOps and aggregates to disk
Date: 2018-06-07 00:16:11
Message-ID: 3e8fed8a-d62b-5c2f-f6f9-545e4d54eca3@2ndquadrant.com
Lists: pgsql-hackers
On 06/07/2018 02:11 AM, David Rowley wrote:
> On 7 June 2018 at 08:11, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> On 06/06/2018 04:11 PM, Andres Freund wrote:
>>> Consider e.g. a scheme where we'd switch from hashed aggregation to
>>> sorted aggregation due to memory limits, but already have a number of
>>> transition values in the hash table. Whenever the size of the transition
>>> values in the hashtable exceeds memory size, we write one of them to the
>>> tuplesort (with serialized transition value). From then on further input
>>> rows for that group would only be written to the tuplesort, as the group
>>> isn't present in the hashtable anymore.
>>>
>>
>> Ah, so you're suggesting that during the second pass we'd deserialize
>> the transition value and then add the tuples to it, instead of building
>> a new transition value. Got it.
>
> Having to deserialize every time we add a new tuple sounds terrible
> from a performance point of view.
>
I don't think Andres was suggesting that. I think the scheme was to sort
the spilled tuples and then, in the second pass, read all the tuples for
a group at once (so we would deserialize each transition value just once).
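
For illustration, here's a minimal toy of that two-pass scheme, plain C
and SUM() only -- the fixed-size array and the qsort()ed spill array are
just stand-ins for the real hash table and tuplesort, not how the
executor would actually do it:

/*
 * toy_spill.c -- a minimal sketch of the two-pass scheme above, for
 * SUM(value) GROUP BY key only.  Not PostgreSQL code: the fixed-size
 * array stands in for the hash table under work_mem pressure, and an
 * in-memory array plus qsort() stands in for the tuplesort.  Compile
 * with: cc -o toy_spill toy_spill.c
 */
#include <stdio.h>
#include <stdlib.h>

#define TABLE_SLOTS 3			/* stand-in for the work_mem limit */
#define MAX_SPILL	64

/* One in-memory group. */
typedef struct { int key; long sum; int used; } Group;

/* One spilled entry: a serialized partial state, or a raw input row. */
typedef struct { int key; int is_state; long value; } SpillEntry;

static Group	  table[TABLE_SLOTS];
static SpillEntry spill[MAX_SPILL];
static int		  nspill;

static void
spill_add(int key, int is_state, long value)
{
	spill[nspill].key = key;
	spill[nspill].is_state = is_state;
	spill[nspill].value = value;
	nspill++;
}

static int
cmp_spill(const void *a, const void *b)
{
	const SpillEntry *x = a, *y = b;

	return (x->key > y->key) - (x->key < y->key);
}

int
main(void)
{
	int		keys[] = {1, 2, 3, 4, 1, 2, 4, 5, 1};
	long	vals[] = {10, 20, 30, 40, 11, 21, 41, 50, 12};
	int		r, i;

	for (r = 0; r < 9; r++)
	{
		Group  *g = NULL, *free_slot = NULL;

		for (i = 0; i < TABLE_SLOTS; i++)
		{
			if (table[i].used && table[i].key == keys[r])
				g = &table[i];
			else if (!table[i].used)
				free_slot = &table[i];
		}

		if (g)
			g->sum += vals[r];		/* normal in-memory transition */
		else if (free_slot)
		{
			free_slot->used = 1;	/* new group still fits */
			free_slot->key = keys[r];
			free_slot->sum = vals[r];
		}
		else
		{
			/*
			 * Table full: this group lives in the "tuplesort" from now
			 * on, so write the raw input row there.  With growing
			 * transition values (e.g. string_agg) we would also pick a
			 * victim group here, write its serialized state, and evict
			 * it; plain SUM states are fixed-size, so this toy doesn't.
			 */
			spill_add(keys[r], 0, vals[r]);
		}
	}

	/* Flush the surviving in-memory groups as serialized states. */
	for (i = 0; i < TABLE_SLOTS; i++)
		if (table[i].used)
			spill_add(table[i].key, 1, table[i].sum);

	/*
	 * Second pass: sort, then consume each group's entries in one go,
	 * "deserializing" its partial state only once.
	 */
	qsort(spill, nspill, sizeof(SpillEntry), cmp_spill);
	for (i = 0; i < nspill;)
	{
		int		key = spill[i].key;
		long	state = 0;

		for (; i < nspill && spill[i].key == key; i++)
			state += spill[i].value;	/* combinefn for states, transfn
										 * for raw rows; for SUM both
										 * are just '+' */
		printf("key %d: sum = %ld\n", key, state);
	}
	return 0;
}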
> Can't we just:
>
> 1. HashAgg until the hash table reaches work_mem.
> 2. Spill the entire table to disk.
> 3. Destroy the table and create a new one.
> 4. If more tuples: goto 1
> 5. Merge sort and combine each dumped set of tuples.
>
... and this is pretty much what Jeff Davis suggested, I think. The
trouble is, each of those schemes behaves nicely in some corner cases
and terribly in others.
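
To make the comparison concrete, here's the same toy reworked along the
lines of steps 1-5 above (again SUM() only; the final qsort() stands in
for a real merge of the already-sorted runs):

/*
 * toy_batch.c -- the batch variant: each time the table fills, the
 * whole table is dumped as a sorted run and recreated; at the end the
 * runs are merged, combining partial sums for equal keys.  Compile
 * with: cc -o toy_batch toy_batch.c
 */
#include <stdio.h>
#include <stdlib.h>

#define TABLE_SLOTS 3			/* stand-in for work_mem */
#define MAX_DUMP	64

typedef struct { int key; long sum; } Group;

static Group	table[TABLE_SLOTS];
static int		ngroups;
static Group	dumped[MAX_DUMP];	/* all runs, concatenated */
static int		ndumped;

static int
cmp_group(const void *a, const void *b)
{
	const Group *x = a, *y = b;

	return (x->key > y->key) - (x->key < y->key);
}

/* Steps 2-3: spill the whole table as a sorted run, then reset it. */
static void
dump_table(void)
{
	int		i;

	qsort(table, ngroups, sizeof(Group), cmp_group);
	for (i = 0; i < ngroups; i++)
		dumped[ndumped++] = table[i];
	ngroups = 0;
}

int
main(void)
{
	int		keys[] = {1, 2, 3, 4, 1, 2, 4, 5, 1};
	long	vals[] = {10, 20, 30, 40, 11, 21, 41, 50, 12};
	int		r, i;

	for (r = 0; r < 9; r++)
	{
		Group  *g = NULL;

		for (i = 0; i < ngroups; i++)
			if (table[i].key == keys[r])
				g = &table[i];

		if (g == NULL)
		{
			if (ngroups == TABLE_SLOTS)
				dump_table();			/* steps 2-4 */
			g = &table[ngroups++];
			g->key = keys[r];
			g->sum = 0;
		}
		g->sum += vals[r];				/* step 1: plain HashAgg */
	}
	dump_table();						/* flush the final run */

	/* Step 5: "merge" the runs, combining sums for equal keys. */
	qsort(dumped, ndumped, sizeof(Group), cmp_group);
	for (i = 0; i < ndumped;)
	{
		int		key = dumped[i].key;
		long	sum = 0;

		for (; i < ndumped && dumped[i].key == key; i++)
			sum += dumped[i].sum;
		printf("key %d: sum = %ld\n", key, sum);
	}
	return 0;
}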
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services