Re: Spilling hashed SetOps and aggregates to disk

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>
Subject: Re: Spilling hashed SetOps and aggregates to disk
Date: 2018-06-06 14:11:11
Message-ID: 20180606141111.hszzgxmgjqcvskwh@alap3.anarazel.de
Lists: pgsql-hackers

On 2018-06-06 16:06:18 +0200, Tomas Vondra wrote:
> On 06/06/2018 04:01 PM, Andres Freund wrote:
> > Hi,
> >
> > On 2018-06-06 15:58:16 +0200, Tomas Vondra wrote:
> > > The other issue is that serialize/deserialize is only a part of a problem -
> > > you also need to know how to do "combine", and not all aggregates can do
> > > that ... (certainly not in universal way).
> >
> > There are several schemes where only serialize/deserialize are needed,
> > no? There are a number of fairly sensible schemes where there won't be
> > multiple transition values for the same group, no?
> >
>
> Possibly, not sure what schemes you have in mind exactly ...
>
> But if you know there's only a single transition value, why would you need
> serialize/deserialize at all. Why couldn't you just finalize the value and
> serialize that?

Because you don't necessarily have all the necessary input rows
yet.

Consider e.g. a scheme where we'd switch from hashed aggregation to
sorted aggregation due to memory limits, but already have a number of
transition values in the hash table. Whenever the size of the transition
values in the hash table exceeds the memory limit, we write one of the
groups to the tuplesort (with its serialized transition value). From then
on, further input rows for that group would only be written to the
tuplesort, as the group isn't present in the hash table anymore.
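
To make that a bit more concrete, here is a rough toy sketch in Python
(not PostgreSQL code). Everything in it is an assumption made for
illustration: the function name hash_then_sort_agg, the list-collecting
aggregate, the "number of stored values" memory accounting, evicting the
largest group, and a plain Python list standing in for the tuplesort.
The point is only that each group has at most one serialized transition
value in the sort, so serialize/deserialize suffice and no combine step
is needed.

```python
import pickle

STATE, ROW = 0, 1  # within a group, the serialized state sorts before raw rows


def hash_then_sort_agg(rows, mem_limit):
    """Collect values per key, spilling groups once mem_limit is exceeded."""
    table = {}           # hash table: key -> transition value (a list here)
    spill = []           # stand-in for the tuplesort: (key, kind, seq, payload)
    mem_used = 0         # crude accounting: number of values held in the table
    sorted_mode = False  # once True, no new groups enter the hash table

    for seq, (key, value) in enumerate(rows):
        if key in table:
            table[key].append(value)              # advance in-memory state
            mem_used += 1
        elif sorted_mode:
            spill.append((key, ROW, seq, value))  # group not in table: raw row
        else:
            table[key] = [value]                  # still hashing: new group
            mem_used += 1

        # Over budget: evict groups, writing their serialized transition values.
        while mem_used > mem_limit and table:
            victim = max(table, key=lambda k: len(table[k]))
            state = table.pop(victim)
            mem_used -= len(state)
            spill.append((victim, STATE, seq, pickle.dumps(state)))
            sorted_mode = True

    # Groups still in the table were never spilled, so they saw all their rows.
    results = {key: list(state) for key, state in table.items()}

    # Sorted phase: per group, an optional serialized state, then raw rows.
    spill.sort(key=lambda entry: entry[:3])
    current_key, state = None, None
    for key, kind, _seq, payload in spill:
        if key != current_key:
            if current_key is not None:
                results[current_key] = state      # previous group is complete
            current_key, state = key, []          # start from the initial state
        if kind == STATE:
            state = pickle.loads(payload)         # deserialize, no combine
        else:
            state.append(payload)                 # ordinary transition step
    if current_key is not None:
        results[current_key] = state
    return results
```

An evicted group never re-enters the hash table, and no new groups are
added to it after the switch, so a key ends up either in the table or in
the sort, never both. That is why finalizing early isn't an option for
the evicted groups, and also why one transition value per group is
enough.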

Greetings,

Andres Freund
