Re: The Future of Aggregation

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, "amit(dot)kapila(at)enterprisedb(dot)com" <amit(dot)kapila(at)enterprisedb(dot)com>, Simon Riggs <simon(dot)riggs(at)2ndquadrant(dot)com>
Subject: Re: The Future of Aggregation
Date: 2015-06-15 03:55:25
Message-ID: CAKJS1f80FRnwGLMSiHR079P5=EUu8XV4roWmF8OBr8s=gvkKOg@mail.gmail.com
Lists: pgsql-hackers

On 12 June 2015 at 23:57, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:

> On 11 June 2015 at 01:39, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
>
>>
>> One question that arose in my mind running this was whether we might
>> be able to combine sum(x) with count(*) if x was NOT NULL, even
>> though the arguments don't match. It might not be worth the
>> gymnastics of recognizing the special case, and I certainly
>> wouldn't recommend looking at that optimization in a first pass;
>> but it might be worth jotting down on a list somewhere....
>>
>>
> I think it's worth looking into that at some stage. I think I might have
> some of the code that would be required for the NULL checking over here ->
> http://www.postgresql.org/message-id/CAApHDvqRB-iFBy68=dCgqS46aRep7AuN2pou4KTwL8kX9YOcTQ@mail.gmail.com
>
> I'm just not so sure what the logic would be to decide when we could apply
> this. The only properties I can see that may be along the right lines are
> pg_proc.pronargs for int8inc and int8inc_any.
>

Actually, a possible realistic solution to this just came to me.

Add another property to pg_aggregate which stores the oid of an alternative
aggregate function which can be used when all arguments to the transition
function can be proved to be not null. This of course would be set to
InvalidOid in most cases.

At least one good use case would be that COUNT(not_null_expr) could use
COUNT(*).

This likely could not be applied if DISTINCT or ORDER BY were used.
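
To make the idea concrete, here is a minimal sketch in C of the decision
described above. It is not PostgreSQL code: the structs are simplified
stand-ins, the column name aggnotnullalt is hypothetical (no such column
exists in pg_aggregate), the OID values are placeholders, and the proof that
the arguments are not null is assumed to have been done elsewhere and passed
in as a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Placeholder OIDs for illustration only; not real catalog values. */
#define COUNT_ANY_OID  ((Oid) 1001)    /* stands in for COUNT(expr) */
#define COUNT_STAR_OID ((Oid) 1002)    /* stands in for COUNT(*) */

/* Simplified, hypothetical stand-in for a pg_aggregate row. */
typedef struct AggCatalogRow
{
    Oid aggfnoid;        /* the aggregate as written in the query */
    Oid aggnotnullalt;   /* hypothetical column: alternative aggregate to
                          * use when all arguments are provably NOT NULL,
                          * or InvalidOid if there is none */
} AggCatalogRow;

/* Simplified, hypothetical description of one aggregate call. */
typedef struct AggCall
{
    const AggCatalogRow *row;
    bool all_args_not_null;   /* assumed to be proven by the planner */
    bool has_distinct;        /* call uses DISTINCT */
    bool has_order_by;        /* call uses ORDER BY */
} AggCall;

/* Return the OID of the aggregate that should actually be executed. */
static Oid
choose_aggregate(const AggCall *call)
{
    assert(call != NULL && call->row != NULL);

    /* Most aggregates would have no alternative registered. */
    if (call->row->aggnotnullalt == InvalidOid)
        return call->row->aggfnoid;

    /* DISTINCT or ORDER BY still needs the argument values themselves. */
    if (call->has_distinct || call->has_order_by)
        return call->row->aggfnoid;

    /* Without a NOT NULL proof the substitution could change the result. */
    if (!call->all_args_not_null)
        return call->row->aggfnoid;

    return call->row->aggnotnullalt;
}
```

Under these assumptions, COUNT(not_null_expr) would resolve to the same
function as COUNT(*), while COUNT(DISTINCT not_null_expr) would be left
alone.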

I'm not proposing that we go and do this. More analysis would have to be
done to see whether the trade-offs in extra code and extra planning time
produce a net win overall. The aggregate code has already gotten quite a
bit more complex over the past couple of releases.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
