From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Some notes about Param handling with "Oracle style" plpgsql variables |
Date: | 2009-11-02 19:05:23 |
Message-ID: | 162867790911021105l5f0dd11fp401f5e0e33554eae@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
2009/11/2 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> One of the interesting properties of Oracle-compatible variable
> references in plpgsql is that the set of variables referenced by a
> given query could change during a forced replan. For example,
> consider
>
> declare x int;
> r record;
> ...
> for r in select x,y from tab loop ...
>
> If tab contains a column "x" then the "x" reference in the SELECT
> refers to tab.x; if not, it refers to the plpgsql variable x.
> So when first executing the SELECT we might find that it requires
> a Param reference to the plpgsql variable, and then after a replan
> is forced by ALTER TABLE tab ADD COLUMN x, there is no need for
> the Param anymore. Or vice versa.
>
> This kinda calls into question whether the Oracle way is actually
> a good idea or not; but my purpose here is not to debate that,
> just to look at what it takes to implement it.
This is reason, why I would to see third mode (incompatible with
Oracle or pg), that raise error, when it detects any intersecting
identifiers. I understand so this mode should not be default, but
personally I'll use it everywhere.
Pavel
>
> Currently, plpgsql generates a list of the variables referenced by
> any SQL statement or expression immediately upon seeing the text,
> before it's ever even fed to the core parser. I had been envisioning
> having the parser callback hook construct the list on-the-fly during
> parsing, but the possibility that the list will change from time to
> time means that other changes are needed too. Notably:
>
> 1. plancache.c does not have any provision for letting the Param type
> array associated with a stored statement change when the statement is
> replanned due to SI invalidation.
>
> 2. The control flow for a replan is that plpgsql calls SPI_execute_plan,
> which calls RevalidateCachedPlan, which does the replan if the cached
> plan is discovered to be stale. However, plpgsql already had to set up
> the list of actual parameter values before it called SPI_execute_plan,
> which means it is *way* too late to change the list of required Params
> even if plancache let us do it.
>
> After chewing on these facts for awhile, I am thinking that the best
> solution is for plpgsql to abandon the notion of a predetermined list
> of parameters for a SQL query altogether. What that list basically
> provides is a mapping from Param numbers ($n) to plpgsql "datum numbers"
> (indexes in the list of a plpgsql function's variables). We could make
> that mapping always be one-to-one, since there's no real reason that the
> Params available to a query have to be consecutively numbered. So the
> transformColumnRef hook would just pass back a Param using the
> referenced variable's datum number as paramid; it wouldn't bother at all
> with building a data structure listing the specific variables actually
> used in the query.
>
> As far as plancache goes, it would therefore always see a null array
> of Param type OIDs associated with a plpgsql-generated query, and we'd
> not have to provide a way to update that. (We'd still keep the ability
> to store such an array, because most other callers of plancache will
> still want a fixed list of Params.) What we'd have to add to plancache
> instead is the ability to install caller-determined parser callback
> hooks when it is calling the parser for a replan. This seems fairly
> easy to do --- I'm envisioning a sort of meta-hook function that gets
> called with the new ParseState and can insert hook function pointers
> in it.
>
> The other issue with this is what to do at runtime. We could do it
> with no other changes if we had plpgsql always set up Values/Nulls
> arrays listing *every* datum's current value. This seems a bit
> brute-force though --- it could be slow in a function with a lot of
> variables, and in most cases any specific query or expression would
> not need most of those values. What I think we should do instead
> is extend the ParamListInfo structure to add a callback hook function
> that populates individual ParamExternData array entries on-demand.
> The core executor would call the hook when it tried to fetch the
> value of a Param that was currently invalid (ptype == 0). So the
> hook would be invoked only once per query per referenced parameter,
> which shouldn't be much overhead. Another interesting property
> of this approach is that it'd fix the longstanding user complaint
> that constructions like
> if (TG_OP = 'INSERT' and NEW.foo = 'bar') ...
> fail prematurely. The executor would never demand the value
> of NEW.foo, and thus not fail, if TG_OP isn't INSERT.
>
> Comments?
>
> regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2009-11-02 19:16:15 | Re: Some notes about Param handling with "Oracle style" plpgsql variables |
Previous Message | Magnus Hagander | 2009-11-02 18:41:58 | Re: Renaming conversion procs (was Re: Error on compile for Windows) |