Re: dubious optimization of the function in SELECT INTO target list

From: Oleksii Kliukin <alexk(at)hintbits(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: dubious optimization of the function in SELECT INTO target list
Date: 2015-10-08 08:43:43
Message-ID: 27E185D4-3BC0-4C85-8605-857FB5BA673F@hintbits.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


> On 06 Oct 2015, at 23:11, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> wrote:
>
> On 10/06/2015 02:00 PM, Oleksii Kliukin wrote:
>>
>>> On 06 Oct 2015, at 22:50, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com
>>> <mailto:adrian(dot)klaver(at)aklaver(dot)com>> wrote:
>>>
>>> On 10/06/2015 01:48 PM, Oleksii Kliukin wrote:
>>>>
>>>>> On 06 Oct 2015, at 22:40, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com
>>>>> <mailto:adrian(dot)klaver(at)aklaver(dot)com>
>>>>> <mailto:adrian(dot)klaver(at)aklaver(dot)com>> wrote:
>>>>>
>>>>> On 10/06/2015 01:13 PM, Oleksii Kliukin wrote:
>>>>>>
>>>>>> Basically, if we invoke the first example, the foo table with have only
>>>>>> 1 row and not 10, as supplied by the generate_series.
>>>>>> However, when ORDER BY is attached to the query, or aggregate (such as
>>>>>> max, min or array_agg) is wrapped around the test(id) call, the test
>>>>>> function is called exactly 10 times. If I replace the SELECT INTO with
>>>>>> PERFORM, it would also be called 10 times. Unfortunately, it is not
>>>>>> possible to use PERFORM directly in the CTE expression.
>>>>>
>>>>> What CTE expression?
>>>>
>>>> Any CTE expression :-). The example here is just an illustration to
>>>> expose the issue. The real-world query I came across used a complex CTE
>>>> expression and called a function at the end of it inside the SELECT INTO
>>>> statement.
>>>
>>> Remember SELECT INTO inside plpgsql is different from SELECT INTO outside:
>>>
>>> http://www.postgresql.org/docs/9.4/interactive/plpgsql-statements.html#PLPGSQL-STATEMENTS-SQL-ONEROW
>>>
>>> "Tip: Note that this interpretation of SELECT with INTO is quite
>>> different from PostgreSQL's regular SELECT INTO command, wherein the
>>> INTO target is a newly created table. If you want to create a table
>>> from a SELECT result inside a PL/pgSQL function, use the syntax CREATE
>>> TABLE ... AS SELECT.
>>
>> Thank you. In this case SELECT INTO was consciously called inside the
>> pl/pgSQL function, as one cannot do SELECT function() from pl/pgSQL
>> without storing the result of the function somewhere (with the INTO clause).
>
> So what you asking is why to replicate this:
>
> DECLARE l_id integer;
> BEGIN
> PERFORM test(id)
> FROM generate_series(1,10) as id ;
> END;
> $$ LANGUAGE plpgsql;
>
> you have to do something like this?:
>
> DO $$
> DECLARE l_id integer;
> BEGIN
> SELECT test(id) INTO l_id
> FROM generate_series(1,10) AS id order by id;
> END;
> $$ LANGUAGE plpgsql;
> DO

My question was, essentially, if SELECT INTO in pl/pgSQL is supposed to stop after emitting the first row, ignoring the fact that the expression it calls may have side effects. I think I’ve got the answer from Tom that yes, it is supposed to be so, but I still think the docs are quite ambiguous about it (i.e. I read "Any result rows after the first row are discarded.” in the SELECT INTO description as a possible sign that they are still evaluated).

Kind regards,
--
Oleksii

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Oleksii Kliukin 2015-10-08 08:57:41 Re: dubious optimization of the function in SELECT INTO target list
Previous Message hari.fuchs 2015-10-08 07:49:10 Re: Best practices for aggregate table design