From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Letting plpgsql in on the fun with the new expression eval stuff |
Date: | 2017-12-20 17:12:48 |
Message-ID: | 27674.1513789968@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2017-12-19 13:00:41 -0500, Tom Lane wrote:
>> I'm looking at ways to get plpgsql expression evaluation to go faster,
>> and one thing I'm noticing is the rather large overhead of going through
>> ExecEvalParamExtern and plpgsql_param_fetch to get to the useful work
>> (exec_eval_datum).
> What's the workload you're testing? I'm mildly surprised to see
> ExecEvalParamExtern() show up, rather than just plpgsql_param_fetch() &
> exec_eval_datum(). Or were you just listing that to specify the
> callpath?
I'm using several different test cases, but one that shows up the problem
is
create table foo(a int, b int, c int);
insert into foo select 1,2,3 from generate_series(1,1000000);
vacuum foo;
create or replace function ptest4_rec() returns void as $$
declare r record; t int;
begin
for r in select * from foo loop
t := r.a + r.b + r.c;
end loop;
end$$
language plpgsql stable;
-- then trace this:
do $$
begin
for i in 1..200 loop
perform ptest4_rec();
end loop;
end; $$ language plpgsql;
The outer do-block is just to get a run long enough for reliable perf
statistics. I find these relevant entries in the report:
Children Self Samples Shared Object Symbol
+ 98.29% 4.44% 10827 postgres [.] ExecInterpExpr
+ 61.82% 3.23% 7878 plpgsql.so [.] exec_eval_expr
+ 34.64% 3.91% 9521 postgres [.] ExecEvalParamExtern
+ 29.78% 4.88% 11892 plpgsql.so [.] plpgsql_param_fetch
+ 23.06% 6.90% 16831 plpgsql.so [.] exec_eval_datum
+ 6.28% 2.89% 7049 plpgsql.so [.] setup_param_list
+ 4.02% 4.01% 9774 postgres [.] bms_next_member
I think the ridiculous "children" percentage for ExecInterpExpr can be
discounted, because that's a result of essentially everything happening
inside the outer call of ptest4_rec. But the rest of it can be taken
at face value, and it says that ExecEvalParamExtern and
plpgsql_param_fetch are each pretty expensive; and both of them are just
interface routines doing little useful work.
>> We've ameliorated that for DTYPE_VAR variables
>> by keeping a pre-set-up copy of their values in a ParamListInfo struct,
>> but that's pretty ugly and carries a bunch of costs of its own.
> Just to make sure I understand correctly, you're talking about
> setup_unshared_param_list() / setup_param_list()?
Right, and the stuff in e.g. assign_simple_var() to keep the paramlist
up to date.
>> So the execution would look more or less like
>> op->eval_param(state, op, econtext);
>> and there'd need to be some extra fields (at least a void*) in the op
>> struct where plpgsql could keep private data.
> I think I'd redo the parameters to the callback slightly, but generally
> that sounds sane. Was thinking of something more like
Um, you left out something here? I don't mind changing the callback
signature, but it seems like it generally ought to look the same as
the other out-of-line eval functions.
> One question I have is how we will re-initialize the relevant state
> between exec_simple_expr() calls. I guess the most realistic one is that
> the op will have a pointer into an array managed by exec_simple_expr()
> that can get reset?
I'm not seeing anything that needs to be reset?
> Hm. We could have a version of ExecInitExpr() / ExecInitExprRec that
> first gives a callback chance to handle an operation. That'd make
> overwriting parts like this quite easy, because we know we want to
> handle Param nodes anywhere differently. So we'd not have a Param
> specific routine, but the ability to intercept everything, and defer to
> ExecInitExprRec() if undesired.
Yeah, I thought of that idea too, but at least for what I'm trying to
do here, it doesn't seem all that helpful. The problem is that plpgsql
needs to get its hooks into not only its own direct calls of ExecInitExpr
for "simple" expressions, but also calls that occur within arbitrary plan
trees that are parsed/planned/executed through SPI. And then things need
to happen entirely differently in parallel child workers, which will have
"flat" copies of the ParamListInfo. So I really want a hook that's tied
to ParamListInfo, and that only makes much sense for PARAM_EXTERN Params.
There may be use-cases for the more general hook you're talking about,
but I'm not very clear where that would be set up.
>> Another way we could do it is to invent ExecInitExprWithParams() that
>> takes an additional ParamListInfo pointer, and use that. Rather than
>> adding yet one more parameter that has to be passed down through
>> ExecInitExprRec, I suggest that we could waste a bit of space in
>> struct ExprState and store that value there. Maybe do the same with
>> the parent pointer so as to reduce the number of recursive parameters.
> I was thinking something slightly different. Namely that we should have
> two structs ExprState and ExprBuildState. We can stuff random additions
> to ExprBuildState without concerns about increasing ExprState's state.
Yeah, perhaps --- there's already some existing fields that don't need
to be kept around past the build phase. I haven't done it like that
in the patch I'm working on, but no objections if you want to separate
things into two structs. On the other hand, I doubt it would save
anything, what with palloc's habit of rounding up request sizes.
I have a patch nearly ready to submit, but please clarify your comment
about what you think the callback function signature should be?
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | David Fetter | 2017-12-20 17:21:22 | Schema-qualify function calls in information_schema |
Previous Message | Alvaro Herrera | 2017-12-20 17:01:58 | Re: [HACKERS] Proposal: Local indexes for partitioned table |