Re: segfault in HEAD when too many nested functions call

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: segfault in HEAD when too many nested functions call
Date: 2017-07-18 19:45:53
Message-ID: 3979.1500407153@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> Attached is a trivial patch implementing 1) and a proof-of-concept hack
> for 2).

The comment you made previously, as well as the proposed commit message
for 0001, suggest that you've forgotten that pre-v10 execQual.c had
several check_stack_depth() calls. Per its header comment,

* During expression evaluation, we check_stack_depth only in
* ExecMakeFunctionResult (and substitute routines) rather than at every
* single node. This is a compromise that trades off precision of the
* stack limit setting to gain speed.

and there was also a check in the recursive ExecInitExpr routine.

While it doesn't seem to be documented anywhere, I believe that we'd
argued that ExecProcNode and friends didn't need their own stack depth
checks because any nontrivial node would surely do expression evaluation
which would contain a check.

So the basic issue here is that (1) expression eval, per se, no longer
has any check and (2) it's not clear that we can rely on expression
compilation to substitute for that, since at least in principle
recompilation could be skipped during recursive use of a plan node.

I agree that adding a check to ExecInitNode() is really necessary,
but I'm not convinced that it's a sufficient substitute for checking
in ExecProcNode(). The two flaws I see in that argument are

(a) you've provided no hard evidence backing up the argument that node
initialization will take strictly more stack space than node execution;
as far as I can see that's just wishful thinking.

(b) there's also an implicit assumption that ExecutorRun is called from
a point not significantly more deeply nested than the corresponding
call to ExecutorStart. This seems also to depend on nothing much better
than wishful thinking. Certainly, many ExecutorRun calls are right next
to ExecutorStart, but several of them aren't; the portal code and
SQL-function code are both scary in this respect.

> The latter obviously isn't ready, but might make clearer what I'm
> thinking about. If we were to go this route we'd have to probably move
> the callback assignment into the ExecInit* routines, and possibly
> replace the switch in ExecInitNode() with another callback, assigned in
> make_*, and implement callbacks for ExecShutdown, MultiExecProcNode etc.

While I'm uncomfortable with making such a change post-beta2, I'm one
heck of a lot *more* uncomfortable with shipping v10 with no stack depth
checks in the executor mainline. Fleshing this out and putting it in
v10 might be an acceptable compromise.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2017-07-18 20:04:10 Re: segfault in HEAD when too many nested functions call
Previous Message Julien Rouhaud 2017-07-18 19:35:43 Re: segfault in HEAD when too many nested functions call