Re: Aggregate transition state merging vs. hypothetical set functions

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Aggregate transition state merging vs. hypothetical set functions
Date: 2017-10-13 08:23:59
Message-ID: 58219c24-ed31-3a3d-db28-ace9390973c5@iki.fi
Lists: pgsql-hackers

On 10/13/2017 02:08 AM, Tom Lane wrote:
> I started to look into fixing orderedsetaggs.c so that we could revert
> 52328727b, and soon found a rather nasty problem. Although the plain
> OSAs seem amenable to supporting multiple finalfn calls on the same
> transition state, the "hypothetical set" functions are not at all.
> What they do is to accumulate all the regular aggregate input into
> a tuplesort, then add the direct arguments as an additional "hypothetical"
> row, and finally sort the result. There's no realistic way of undoing
> the addition of the hypothetical row, so these finalfns simply can't
> share tuplesort state.
>
> You could imagine accumulating the regular input into a tuplestore
> and then copying it into a tuplesort in each finalfn call. But that
> seems pretty icky, and I'm not sure it'd really be any more performant
> than just keeping separate tuplesort objects for each aggregate.

The current implementation, with the extra flag column, is quite naive.
We could add some support to tuplesort to find a row with given
attributes, and call that instead of actually adding the hypothetical
row to the tuplesort and iterating to re-find it after performsort. For
a result set that fits in memory, you could even do a binary search
instead of linearly iterating through the result set.
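
To make the idea concrete, here is a minimal sketch in Python rather than C, and not tuplesort code. It assumes the regular input has already been sorted once and fits in memory, and it uses a plain list of comparable values to stand in for the sorted result. The function names are hypothetical, and the rank and cume_dist formulas are my reading of the SQL semantics, not taken from orderedsetaggs.c. The point is that the hypothetical row's position is found by binary search, so nothing is added to the shared sorted state and several final functions can read it.

```python
from bisect import bisect_left, bisect_right


def hypothetical_rank(sorted_rows, value):
    """Rank `value` would have if it were added to the input.

    Assumed semantics: 1 + number of rows sorting strictly before it.
    """
    return bisect_left(sorted_rows, value) + 1


def hypothetical_cume_dist(sorted_rows, value):
    """Cumulative distribution of `value` if it were added to the input.

    Assumed semantics: rows <= value, counting the hypothetical row
    itself, divided by the row count including the hypothetical row.
    """
    preceding_or_peer = bisect_right(sorted_rows, value) + 1
    return preceding_or_peer / (len(sorted_rows) + 1)


# Stand-in for the sorted regular input; it is only read, never modified.
rows = sorted([10, 20, 20, 30, 40])

# Two different "final functions" consult the same sorted state.
rank_of_25 = hypothetical_rank(rows, 25)
cume_dist_of_40 = hypothetical_cume_dist(rows, 40)
```

Because the lookups are read-only, calling them any number of times with different hypothetical values leaves `rows` as it was, which is the property the flag-column approach lacks.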

I'm not suggesting that we do that right now, though. And even if we
did, there might be other aggregates out there that are not safe to
merge, so we should still add the option to CREATE AGGREGATE. But
that'd be a nice little project, if someone wanted to make those functions
faster.

- Heikki
