Rethinking TupleTableSlot deforming

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Rethinking TupleTableSlot deforming
Date: 2016-07-22 01:56:05
Message-ID: 20160722015605.hpthk7axm6sx2mur@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

I've previously mentioned (e.g. [1]) that tuple deforming is a serious
bottleneck. I've also experimented successfully [2] with making
slot_deform_tuple() faster.

But nonetheless, tuple deforming is still a *major* bottleneck in many
cases, if not *the* major bottleneck.

We could partially address that by JITing the work slot_deform_tuple
does. Various people have, with good but not raving success, played with
that.

Alternatively/Additionally we can change the tuple format to make
deforming faster.

But I think the bigger issue than the above is actually that we're just
performing a lot of useless work in a number of common scenarios. We're
always deforming all columns up to the one needed. Very often that's a
lot of useless work. I've experimented with selectively replacing
slot_getattr() calls with heap_getattr(), and for some queries that can
yield massive speedups, and obviously significant slowdowns in others.
The speedups occur even when preceding columns are varlena and/or
contain nulls. I.e. a good chunk of the problem is storing the results
of deforming, not accessing the data.
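
To make the cost difference concrete, here is a toy model, not PostgreSQL
code. ToyTuple, ToySlot and the toy_* functions are hypothetical stand-ins
for a heap tuple, a TupleTableSlot, slot_getattr() and heap_getattr(). It
uses fixed-width int columns, so it does not model the offset walk a real
tuple needs; it only shows how many stores each access style performs.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NATTS 8

/* Hypothetical stand-in for a heap tuple: int columns plus null flags. */
typedef struct ToyTuple
{
    int         data[TOY_NATTS];
    bool        isnull[TOY_NATTS];
} ToyTuple;

/* Hypothetical stand-in for a TupleTableSlot. */
typedef struct ToySlot
{
    const ToyTuple *tuple;
    int         nvalid;             /* leading columns already deformed */
    int         values[TOY_NATTS];
    bool        isnull[TOY_NATTS];
    long        nstores;            /* columns written into the slot arrays */
} ToySlot;

/* Point the slot at a new tuple and forget any cached columns. */
static void
toy_slot_store_tuple(ToySlot *slot, const ToyTuple *tuple)
{
    slot->tuple = tuple;
    slot->nvalid = 0;
}

/* Slot-style access: deform and cache every column up to attnum (1-based). */
static int
toy_slot_getattr(ToySlot *slot, int attnum, bool *isnull)
{
    assert(attnum >= 1 && attnum <= TOY_NATTS);

    for (int i = slot->nvalid; i < attnum; i++)
    {
        slot->values[i] = slot->tuple->data[i];
        slot->isnull[i] = slot->tuple->isnull[i];
        slot->nstores++;
    }
    if (attnum > slot->nvalid)
        slot->nvalid = attnum;

    *isnull = slot->isnull[attnum - 1];
    return slot->values[attnum - 1];
}

/* Direct access: fetch one column from the tuple, store nothing. */
static int
toy_direct_getattr(const ToyTuple *tuple, int attnum, bool *isnull)
{
    assert(attnum >= 1 && attnum <= TOY_NATTS);

    *isnull = tuple->isnull[attnum - 1];
    return tuple->data[attnum - 1];
}
```

In this model, fetching column 6 through the slot stores six columns,
while the direct fetch stores none. The flip side is that a second
access through the slot is free, which is where the slowdowns come from
when columns are read repeatedly.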

ISTM, we need to change slots so that they contain information about
which columns are interesting. For the hot paths we'd then only ever
allow access to those columns, and we'd only ever deform them. Combined
with the approach in [2], that would allow us to deform tuples a lot more
efficiently.

What I'm basically thinking is that expression evaluation would always
make sure the slots have computed the relevant column set, and deform at
the beginning. There are some cases where we likely would still need to
fall back to a slower path (e.g. whole row refs), but that seems fine.
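
A rough sketch of what "compute the column set, then deform at the
beginning" could look like. Everything here is hypothetical (SketchTuple,
SketchSlot, the sketch_* functions, and the choice of a 32-bit bitmask);
it is a toy with fixed-width columns, not a proposal for the actual slot
layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NATTS 8

/* Hypothetical stand-in for a heap tuple. */
typedef struct SketchTuple
{
    int         data[SKETCH_NATTS];
    bool        isnull[SKETCH_NATTS];
} SketchTuple;

/* Hypothetical slot that remembers which columns anyone will look at. */
typedef struct SketchSlot
{
    const SketchTuple *tuple;
    uint32_t    interesting;        /* bit i set => column i+1 is needed */
    int         values[SKETCH_NATTS];
    bool        isnull[SKETCH_NATTS];
    int         ndeformed;          /* columns stored by the last deform */
} SketchSlot;

/* Would be called at expression-initialization time, once per column ref. */
static void
sketch_mark_interesting(SketchSlot *slot, int attnum)
{
    assert(attnum >= 1 && attnum <= SKETCH_NATTS);
    slot->interesting |= UINT32_C(1) << (attnum - 1);
}

/* Would be called once per tuple, before any expression is evaluated. */
static void
sketch_deform_interesting(SketchSlot *slot, const SketchTuple *tuple)
{
    slot->tuple = tuple;
    slot->ndeformed = 0;

    for (int i = 0; i < SKETCH_NATTS; i++)
    {
        /* A real tuple would still have to step over skipped columns. */
        if ((slot->interesting & (UINT32_C(1) << i)) == 0)
            continue;

        slot->values[i] = tuple->data[i];
        slot->isnull[i] = tuple->isnull[i];
        slot->ndeformed++;
    }
}
```

With columns 2 and 7 marked, a deform stores exactly two columns instead
of the first seven. A whole-row reference would have to mark every
column, or take the slower path.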

That then also allows us to nearly always avoid the slot_getattr() call,
and instead look at tts_values/tts_isnull directly. The checks
slot_getattr() performs, and the call itself, are quite expensive.

What I'm thinking about is:
a) a new ExecInitExpr()/ExecBuildProjectionInfo() which always computes
   a set of interesting columns.
b) replacing all accesses to tts_values/tts_isnull with an inline
   function. In optimized builds that function won't do anything but
   reference the relevant element, but in assert-enabled builds it'd
   check whether said column is actually known to be accessed.
c) making ExecEvalExpr(), ExecProject(), ExecQual() (and perhaps some
   other places) call the new deforming function, which ensures the
   relevant columns are available.
d) replacing nearly all slot_getattr()/slot_getsomeattrs() calls with
   the function introduced in b).
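
The accessor from b) might look roughly like the following. This is only
a sketch: AccSlot and the acc_* names are hypothetical, and standard
assert() stands in for PostgreSQL's Assert(), so the check disappears
when NDEBUG is defined, much like in a build without assertions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ACC_NATTS 8

/* Hypothetical slot: value arrays plus a mask of the deformed columns. */
typedef struct AccSlot
{
    uint32_t    available;          /* bit i set => column i+1 was deformed */
    int         values[ACC_NATTS];
    bool        isnull[ACC_NATTS];
} AccSlot;

/* Only used by the assertions below; attnum is 1-based. */
static inline bool
acc_attr_available(const AccSlot *slot, int attnum)
{
    return attnum >= 1 && attnum <= ACC_NATTS &&
        (slot->available & (UINT32_C(1) << (attnum - 1))) != 0;
}

/* Optimized builds: a plain array reference. Assert builds: checked. */
static inline int
acc_slot_value(const AccSlot *slot, int attnum)
{
    assert(acc_attr_available(slot, attnum));
    return slot->values[attnum - 1];
}

static inline bool
acc_slot_isnull(const AccSlot *slot, int attnum)
{
    assert(acc_attr_available(slot, attnum));
    return slot->isnull[attnum - 1];
}
```

The point of routing every access through such a function is that a
caller reading a column nobody declared as interesting fails loudly in
an assert-enabled build, instead of silently reading stale data.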

To me it seems this work will be a good bit easier once [2] is actually
implemented instead of prototyped, because treating ExecInitExpr()
non-recursively makes it possible to build such 'column sets' more
easily / naturally.

Comments? Alternative suggestions?

Greetings,

Andres Freund

[1] http://archives.postgresql.org/20160624232953(dot)beub22r6yqux4gcp(at)alap3(dot)anarazel(dot)de
[2] http://archives.postgresql.org/message-id/20160714011850.bd5zhu35szle3n3c%40alap3.anarazel.de
