From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Rethinking the parameter access hooks for plpgsql's benefit |
Date: | 2015-03-10 23:24:40 |
Message-ID: | 25506.1426029880@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On the technical side, I do agree that the requirement to allocate and
> zero an array for every new simple expression is unfortunate, but I'm
> not convinced that repeatedly invoking the hook-function is a good way
> to solve that problem. Calling the hook-function figures to be
> significantly more expensive than an array-fetch, so you can almost
> think of the ParamListInfo entries themselves as a cache of data
> retrieved from the hook function. In that view of the world, the
> problem here is that initializing the cache is too expensive. Your
> solution to that is to rip the cache out completely, but what would be
> better is keep the cache and make initializing it cheaper - e.g. by
> sharing one ParamListInfo across all expressions in the same scope; or
> by not zeroing the memory and instead tracking the the first N slots
> are the only ones we've initialized; or $your_idea_here.
Getting back to the actual merits of this patch --- you're right, it
was not a good idea as proposed. I'd been thinking about the scalar
expression case only, and forgot that the same code is used to pass
parameters to queries, which might evaluate those parameters quite
a large number of times. So any savings from reducing the setup time
is sure to be swamped if the hook-function gets called repeatedly.
Another problem is that for ROW and REC datums, exec_eval_datum()
pallocs memory that won't get freed till the end of the statement,
so repeat evaluation would cause memory leaks. So we really need
evaluate-at-most-once semantics in there.
However, this attempt did confirm that we don't need to create a separate
ParamListInfo struct for each evaluation attempt. So the attached,
greatly stripped-down patch just allocates a ParamListInfo once at
function startup, and then zeroes it each time. This does nothing for
the memset overhead but it does get rid of the palloc traffic, which
seems to be good for a few percent on simple-assignment-type statements.
AFAICS this is an unconditional win compared to HEAD. What's more, it
provides at least a basis for doing better later: if we could think of
a reasonably cheap way of tracking which datums get invalidated by an
assignment, we might be able to reset only selected entries in the array
rather than blindly blowing away all of them.
I propose to apply this and leave it at that for now.
regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
dont-realloc-ParamListInfo.patch | text/x-diff | 10.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2015-03-10 23:45:36 | Re: moving from contrib to bin |
Previous Message | Michael Paquier | 2015-03-10 23:08:36 | Re: Relation ordering in FROM clause causing error related to missing entry... Or not. |