Re: Detection of nested function calls

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Hugo Mercier <hugo(dot)mercier(at)oslandia(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Detection of nested function calls
Date: 2013-10-25 16:44:37
Message-ID: 5586.1382719477@sss.pgh.pa.us
Lists: pgsql-hackers

Hugo Mercier <hugo(dot)mercier(at)oslandia(dot)com> writes:
> On 25/10/2013 17:20, Tom Lane wrote:
>> How do you tell the difference between
>>
>> foo(col1, bar(col2))
>> foo(bar(col1), col2)

> I'm still not sure I understand ...
> I assume foo() takes two arguments of type A.
> bar() can take one argument of type A or of another type B.

I was assuming everything was the same datatype in this example, i.e.,
col1, col2, and the result of bar() are all type A.

The point I'm trying to make is that in the first case, foo would be
receiving a first argument that was flat and a second that was not flat;
while in the second case, it would be receiving a first argument that was
not flat and a second that was flat. The expression labeling you're
proposing does not help it tell the difference. What's more, you're
proposing that the labeling be made by generic code that can't possibly
know what bar() is really going to do.

> In bar(), you would have the choice to return either a plain A
> or a pointer to A. Because bar() knows its call is nested (by foo()),
> then it can decide to return a pointer to A.

> foo() is then evaluated and we assume it knows A can be a pointer.
> foo() then knows its nesting level of 0 and must return something
> serialized in that case.

Whoa. That's the most fragile, assumption-filled way you could possibly
go about this. In general, bar() cannot be expected to know whether the
outer function is able to take a non-flat parameter value. And you've
glossed over how foo() would know whether its input was flat or not.

Another point here is that there's no good reason to suppose that a
function should return a flattened value just because it's at the outer
level of its syntactic expression. For example, if we're doing a plain
SELECT foo(...) FROM ..., the next thing that will happen with that value
is it'll be fed to the output function for the datatype. Maybe that
output function would like to have a non-flat input value, too, to save
the time of transforming back to that representation. On the other hand,
if it's a SELECT ... ORDER BY ... and the planner chooses to do the ORDER
BY with a final sort step, we'll probably have to flatten the value to
pass it through sorting. (Or possibly not --- perhaps we could just pass
the toast token through sorting?) There are a lot of considerations here
and it's really unreasonable to expect that static expression labeling
will be able to do the right thing every time.

Basically the only way to make this work reliably is for Datums to be
self-identifying as to whether they're flat or structured values; then
make code do the right thing on-the-fly at runtime depending on what kind
of Datum it gets. Once you've done that, I don't see that parse-time
labeling of expression nesting adds anything useful. As Andres said,
the provisions for toasted datums are a good precedent, and none of that
depends on parse-time decisions.
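
To make the "self-identifying" idea concrete, here is a minimal sketch in
plain C. It is not PostgreSQL code: ValueHeader, FlatValue, StructuredValue,
and every function name are hypothetical, invented only for illustration. It
assumes each value carries a tag in its first field, so a function such as
foo() can check at runtime which representation it was handed, with no help
from the parser.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* All names here are hypothetical; this is not PostgreSQL's Datum API. */
typedef enum { VAL_FLAT, VAL_STRUCTURED } ValueKind;

/* Every value begins with a tag that any consumer can inspect at runtime. */
typedef struct { ValueKind kind; } ValueHeader;

/* Flat form: one contiguous chunk, suitable for storage or sorting. */
typedef struct {
    ValueHeader hdr;
    size_t n;
    int items[];            /* flexible array member */
} FlatValue;

/* Structured form: pointer-based, convenient for in-memory computation. */
typedef struct {
    ValueHeader hdr;
    size_t n;
    int *items;
} StructuredValue;

FlatValue *make_flat(const int *src, size_t n)
{
    FlatValue *f = malloc(sizeof(FlatValue) + n * sizeof(int));
    if (f == NULL)
        return NULL;
    f->hdr.kind = VAL_FLAT;
    f->n = n;
    if (n > 0)
        memcpy(f->items, src, n * sizeof(int));
    return f;
}

StructuredValue *make_structured(const int *src, size_t n)
{
    StructuredValue *s = malloc(sizeof(StructuredValue));
    if (s == NULL)
        return NULL;
    s->hdr.kind = VAL_STRUCTURED;
    s->n = n;
    s->items = malloc((n > 0 ? n : 1) * sizeof(int));
    if (s->items == NULL) {
        free(s);
        return NULL;
    }
    if (n > 0)
        memcpy(s->items, src, n * sizeof(int));
    return s;
}

/* Look at the tag and return the element array for either representation. */
const int *value_items(const ValueHeader *v, size_t *n)
{
    assert(v != NULL && n != NULL);
    if (v->kind == VAL_FLAT) {
        const FlatValue *f = (const FlatValue *) v;
        *n = f->n;
        return f->items;
    }
    const StructuredValue *s = (const StructuredValue *) v;
    *n = s->n;
    return s->items;
}

/* Flatten on demand, e.g. just before the value must be stored or sorted. */
FlatValue *flatten(const ValueHeader *v)
{
    size_t n;
    const int *items = value_items(v, &n);
    return make_flat(items, n);
}

/* foo() accepts any mix of flat and structured arguments. */
long foo(const ValueHeader *a, const ValueHeader *b)
{
    size_t na, nb, i;
    const int *ia = value_items(a, &na);
    const int *ib = value_items(b, &nb);
    long sum = 0;

    for (i = 0; i < na; i++)
        sum += ia[i];
    for (i = 0; i < nb; i++)
        sum += ib[i];
    return sum;
}
```

With a flat "column" value and a structured "bar() result", the sketch's
foo() gives the same answer for foo(col, bar_result) and foo(bar_result, col),
because each argument is inspected individually. Whoever finally needs the
flat form (storage, a sort step) calls flatten() at that point, which is the
same shape as detoasting on demand.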

regards, tom lane
