Declaring a strict function returns not null / eval speed

From: Andres Freund <andres(at)anarazel(dot)de>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Declaring a strict function returns not null / eval speed
Date: 2019-10-01 07:38:50
Message-ID: B3973F0E-5505-480A-BAFE-36C458472794@anarazel.de
Lists: pgsql-hackers

Hi,

We spend a surprising amount of time during expression evaluation re-checking whether the input to a strict function (or similar) is not null, even though the value comes either from a strict function or from a column declared not null.

Now you can rightly say that a strict function can still return NULL, even when called with non-NULL input. But in practice that's quite rare. Most of the common by-value type operators are strict, and approximately none of those return NULL when actually called.

That makes me wonder if it's worthwhile to invent a function property declaring "strict strictness" or such, i.e. that the function never returns NULL when called with non-NULL input. It'd allow for some quite noticeable improvements, e.g. for queries aggregating a lot of rows, where we spend a fair amount of time checking whether the transition value has "turned" not null. I'm about to submit a patch making that less expensive, but it's still expensive.
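To make the idea concrete, here is a minimal sketch in C of how such a property could let an evaluator skip null checks. It is a toy model, not PostgreSQL's expression machinery: `ToyVal`, `ToyFunc`, the `nonnull_result` flag, `call_strict`, and `eval_chain` are all hypothetical names invented for illustration, every function is assumed to be strict, and a real implementation would make this decision once at expression-build time rather than per call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy value: a by-value datum plus a null flag (hypothetical layout). */
typedef struct ToyVal { int64_t value; bool isnull; } ToyVal;

typedef int64_t (*ToyUnaryFn)(int64_t arg, bool *isnull);

/* Toy function descriptor. All functions here are assumed strict. */
typedef struct ToyFunc {
    ToyUnaryFn fn;
    bool nonnull_result; /* hypothetical property: non-NULL in => non-NULL out */
} ToyFunc;

/* Counts how many argument null checks were actually executed. */
static long null_checks = 0;

/* Strict and never returns NULL for non-NULL input. */
static int64_t add_one(int64_t arg, bool *isnull) { (void) isnull; return arg + 1; }

/* Strict, but may return NULL even for non-NULL input. */
static int64_t nullif_zero(int64_t arg, bool *isnull)
{
    if (arg == 0)
        *isnull = true;
    return arg;
}

static const ToyFunc add_one_info = { add_one, true };
static const ToyFunc nullif_zero_info = { nullif_zero, false };

/* Call a strict function, checking the argument only if it might be NULL. */
static ToyVal call_strict(const ToyFunc *f, ToyVal arg, bool arg_known_not_null)
{
    ToyVal result = { 0, false };

    assert(f != 0 && f->fn != 0);
    if (!arg_known_not_null)
    {
        null_checks++;
        if (arg.isnull)
        {
            result.isnull = true; /* strict: NULL in, NULL out, fn not called */
            return result;
        }
    }
    result.value = f->fn(arg.value, &result.isnull);
    return result;
}

/* Evaluate outer(inner(x)), propagating not-null knowledge upwards. */
static ToyVal eval_chain(const ToyFunc *outer, const ToyFunc *inner,
                         ToyVal x, bool x_known_not_null)
{
    ToyVal mid = call_strict(inner, x, x_known_not_null);
    bool mid_known_not_null = x_known_not_null && inner->nonnull_result;

    return call_strict(outer, mid, mid_known_not_null);
}
```

With a not-null input and an inner function carrying the flag, `eval_chain` performs no null checks at all; as soon as the inner function lacks the flag, the check on the outer call comes back.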

I can also imagine that being able to propagate NOT NULL further up the parse-analysis tree could be beneficial for planning, but I've not looked at it in any detail.

A related issue is that, during executor initialization, we currently "lose" information about a column's NOT NULLness just above the lower scan nodes. Efficiency-wise that's a substantial loss for many realistic queries. For JITed deforming, it basically turns a bunch of mov instructions with constant offsets into much slower attribute-by-attribute trawling through the tuple, which can take approximately no advantage of the superscalar nature of just about any relevant processor. For non-JITed execution, an expression step that used a cheaper deforming routine for the cases where only leading not-null columns are accessed would also yield significant speedups. This is made worse by the fact that we often do not actually deform at the scan nodes, due to the physical tlist optimization. It is especially bad for nodes storing tuples as minimal tuples (e.g. hashjoin, hashagg), where often a very significant fraction of time is spent re-deforming columns that were already deformed earlier.
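The difference between the two deforming strategies can be illustrated with a toy tuple format. This is only a sketch under simplifying assumptions: `ToyTuple`, `toy_form`, `deform_generic`, and `deform_all_notnull` are hypothetical, all columns are fixed-width `int32_t`, and the bitmap convention (bit set means NULL) is chosen for the example and is not PostgreSQL's on-disk format. As in a real heap tuple, NULL columns occupy no space in the data area, which is what makes offsets data-dependent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_NATTS 4

/* Toy tuple: null bitmap plus packed data; NULL columns take no space. */
typedef struct ToyTuple {
    uint8_t nullbits;                          /* bit i set => column i is NULL */
    uint8_t data[TOY_NATTS * sizeof(int32_t)];
} ToyTuple;

/* Build a tuple, packing only the non-NULL columns into the data area. */
static void toy_form(ToyTuple *tup, const int32_t *values, const bool *isnull)
{
    size_t off = 0;

    tup->nullbits = 0;
    for (int i = 0; i < TOY_NATTS; i++)
    {
        if (isnull[i])
        {
            tup->nullbits |= (uint8_t) (1u << i);
            continue;
        }
        memcpy(tup->data + off, &values[i], sizeof(int32_t));
        off += sizeof(int32_t);
    }
}

/*
 * Generic path: the offset of column i depends on the null bits of all
 * earlier columns, so each step depends on the previous one.
 */
static void deform_generic(const ToyTuple *tup, int32_t *values, bool *isnull)
{
    size_t off = 0;

    for (int i = 0; i < TOY_NATTS; i++)
    {
        isnull[i] = ((tup->nullbits >> i) & 1) != 0;
        if (isnull[i])
        {
            values[i] = 0;
            continue;
        }
        memcpy(&values[i], tup->data + off, sizeof(int32_t));
        off += sizeof(int32_t);
    }
}

/*
 * Fast path, valid only when every column is known NOT NULL: each offset
 * is a compile-time constant, so the loads are independent of each other.
 */
static void deform_all_notnull(const ToyTuple *tup, int32_t *values, bool *isnull)
{
    assert(tup->nullbits == 0);
    for (int i = 0; i < TOY_NATTS; i++)
    {
        memcpy(&values[i], tup->data + (size_t) i * sizeof(int32_t), sizeof(int32_t));
        isnull[i] = false;
    }
}
```

The fast path is only safe if the not-null knowledge is still available at the node doing the deforming, which is exactly the information that currently gets lost above the scan nodes.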

It doesn't seem very hard to propagate attnotnull upwards in a good number of cases. We don't need to do so everywhere for it to be beneficial.
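As a rough illustration of what conservative upward propagation might look like, the sketch below computes not-null flags for a node's output columns from its child's flags. Everything here is hypothetical (`ToyTarget`, `ToyTargetKind`, `propagate_notnull`, and the `child_side_nullable` parameter standing in for cases such as the nullable side of an outer join); it handles only plain column references and answers "unknown" for anything else, which matches the point that partial coverage is already useful.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical target-list entry: a plain column reference or anything else. */
typedef enum { TOY_COLUMN_REF, TOY_OTHER_EXPR } ToyTargetKind;

typedef struct ToyTarget {
    ToyTargetKind kind;
    int child_col; /* index into the child's output; used for TOY_COLUMN_REF */
} ToyTarget;

/*
 * Derive not-null flags for a node's output columns. An output column is
 * marked not null only if it is a plain reference to a child column that
 * is itself known not null and the child side cannot be null-extended.
 * Everything else conservatively stays "may be NULL".
 */
static void propagate_notnull(const ToyTarget *tlist, size_t ntargets,
                              const bool *child_notnull, size_t nchild,
                              bool child_side_nullable, bool *out_notnull)
{
    assert(tlist != NULL && child_notnull != NULL && out_notnull != NULL);

    for (size_t i = 0; i < ntargets; i++)
    {
        const ToyTarget *te = &tlist[i];

        out_notnull[i] = te->kind == TOY_COLUMN_REF
            && !child_side_nullable
            && te->child_col >= 0
            && (size_t) te->child_col < nchild
            && child_notnull[te->child_col];
    }
}
```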

Comments?

Andres

