Expression based grouping equality implementation

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Expression based grouping equality implementation
Date: 2017-11-29 08:09:34
Message-ID: 20171129080934.amqqkke2zjtekd4t@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

Similar to [1] (Expression based aggregate transition / combine function
invocation), this patch provides small to medium performance benefits on
its own, and lays the groundwork for larger benefits from JIT
compilation later.

This moves the execGrouping.c code that compares tuples for equality
(i.e. execTuplesMatch() and the tuplehash tables themselves) over to the
expression evaluation engine. That turns out to be beneficial on its own
because all needed columns are deformed at once, repeated
slot_getattr() calls are avoided, and branch prediction is better.
Instead of calling execTuplesMatch() as before, callers now set up the
tuples in the ExprContext and call ExecQual().
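
To make the calling convention concrete, here is a minimal toy model of
it, not code from the patch. All names (ToySlot, ToyExprContext,
ToyEqExpr, toy_build_eq_expr(), toy_exec_qual()) are hypothetical
stand-ins, columns are plain ints, and the slots are assumed to be
already deformed. The comparator is built once, the caller places both
tuples in the context, and one evaluation call walks all grouping
columns. NULLs compare as equal, as is usual for grouping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; not PostgreSQL's real structs. */
typedef struct ToySlot
{
    const int  *values;     /* already-deformed column values */
    const bool *isnull;     /* per-column NULL flags */
} ToySlot;

typedef struct ToyExprContext
{
    const ToySlot *inner;   /* first tuple to compare */
    const ToySlot *outer;   /* second tuple to compare */
} ToyExprContext;

typedef struct ToyEqExpr
{
    const int *cols;        /* zero-based grouping column indexes */
    int        ncols;
} ToyEqExpr;

/* Build the comparator once, up front, instead of per comparison. */
static ToyEqExpr
toy_build_eq_expr(const int *cols, int ncols)
{
    ToyEqExpr expr = {cols, ncols};

    assert(cols != NULL && ncols > 0);
    return expr;
}

/* Evaluate the prebuilt comparator against the tuples in the context. */
static bool
toy_exec_qual(const ToyEqExpr *expr, const ToyExprContext *econtext)
{
    for (int i = 0; i < expr->ncols; i++)
    {
        int  col = expr->cols[i];
        bool n1 = econtext->inner->isnull[col];
        bool n2 = econtext->outer->isnull[col];

        if (n1 != n2)
            return false;   /* NULL vs. non-NULL: not equal */
        if (n1)
            continue;       /* both NULL: equal for grouping purposes */
        if (econtext->inner->values[col] != econtext->outer->values[col])
            return false;
    }
    return true;
}

/* Usage: group by columns 0 and 2; column 1 differs but is ignored. */
static bool
toy_demo(void)
{
    static const int  cols[] = {0, 2};
    static const int  a_vals[] = {1, 10, 0};
    static const int  b_vals[] = {1, 20, 0};
    static const bool nulls[] = {false, false, true};  /* column 2 is NULL */
    ToySlot        a = {a_vals, nulls};
    ToySlot        b = {b_vals, nulls};
    ToyExprContext econtext = {&a, &b};
    ToyEqExpr      eq = toy_build_eq_expr(cols, 2);

    return toy_exec_qual(&eq, &econtext);
}
```

The real expression engine evaluates a flat array of steps rather than a
loop over column numbers; the toy only shows the split between building
the comparator and evaluating it.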

I'm not yet 100% happy with this:
- currently the function building the comparator is named
execTuplesMatchPrepare - which ain't quite apt anymore.
- there's now a bit of additional code at callsites to reset the
ExprContext - was tempted to put that in an ExecQualAndReset()
inline wrapper, but that's not entirely trivial because executor.h
doesn't include memutils.h and ResetExprContext() is declared late.
- I wonder if the APIs for execGrouping.h should get whacked around a
bit more aggressively - it'd probably be better if the TupleHashTable
struct were created once instead of being recreated during every
rescan.
- currently only equality is done via the expression mechanism; for
now I've just moved execTuplesUnequal() to its only caller. The
semantics seem so closely matched to NOT IN() that I don't really
expect other users.
- A bunch of the pointers to ExprStates are still named *function - that
seems ok to me, but somebody else might protest.
- some cleanup.
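
For the ExecQualAndReset() item above, a rough self-contained sketch of
the intended shape. This is not code from the patch: ToyQualContext,
ToyQualFn, toy_reset_expr_context() and toy_exec_qual_and_reset() are
hypothetical stand-ins for the real executor.h declarations, and the
"memory" is just a byte counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in; the real ExprContext is far richer. */
typedef struct ToyQualContext
{
    size_t per_tuple_used;  /* bytes of per-tuple scratch memory in use */
} ToyQualContext;

typedef bool (*ToyQualFn) (ToyQualContext *econtext);

/* Stand-in for ResetExprContext(): release per-tuple scratch memory. */
static inline void
toy_reset_expr_context(ToyQualContext *econtext)
{
    econtext->per_tuple_used = 0;
}

/* Evaluate the qual, then reset, so callsites cannot forget the reset. */
static inline bool
toy_exec_qual_and_reset(ToyQualFn qual, ToyQualContext *econtext)
{
    bool result;

    assert(qual != NULL && econtext != NULL);
    result = qual(econtext);
    toy_reset_expr_context(econtext);
    return result;
}

/* Example qual that "allocates" scratch memory while evaluating. */
static bool
toy_always_true_qual(ToyQualContext *econtext)
{
    econtext->per_tuple_used += 64;
    return true;
}
```

A callsite would then use toy_exec_qual_and_reset(toy_always_true_qual,
&econtext) in place of a separate evaluate-then-reset pair.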

But some comments would be welcome!

I've included the work from [1] here as the patches do conflict.

Greetings,

Andres Freund

[1] https://www.postgresql.org/message-id/20171128003121.nmxbm2ounxzb6n2t@alap3.anarazel.de

Attachment Content-Type Size
0001-Simplify-representation-of-aggregate-transition-valu.patch text/x-diff 10.0 KB
0002-More-efficient-AggState-pertrans-iteration.patch text/x-diff 4.2 KB
0003-Expression-evaluatation-based-agg-transition-invocat.patch text/x-diff 74.3 KB
0004-WIP-Do-execGrouping.c-via-expression-eval-machinery.patch text/x-diff 49.2 KB
