expensive function in select list vs limit clause

From: Chris Mair <chris(at)1006(dot)org>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: expensive function in select list vs limit clause
Date: 2017-04-05 08:01:52
Message-ID: 48d635e9c0f5f78430c51fb1208b1fad@smtp.hushmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

I've found a (simple) situation where the planner does something I don't understand.

Below is a complete test case followed by output.

From the timings it appears that in the second explain analyze query a function
call in the select list (expensive()) is evaluated in the sequential scan node
*for each* row in big, despite the use of limit.

I would have expected expensive() to be evaluated only for the ten rows
in the result set. Hence the second explain analyze query shouldn't be more
expensive than the first one.

My trust in Postgres' planner goes so far as I feel the planner is right and there
must be a reason for this :)

Could someone help me understand this behaviour?

Thanks & Bye,
Chris.

-- ***************************************************************************

select version();

-- setup: create a time wasting function and a table with 1M rows

create function expensive() returns double precision as
$$
begin
for i in 1 .. 15000 loop
end loop;
return random();
end;
$$ language 'plpgsql';

create table big as select random() as r from generate_series(1, 1000000);
analyze big;

\timing on

-- benchmark expensive(): one call to expensive takes about 0.3 ms => OK

do $$ begin for i in 1 .. 1000 loop perform expensive(); end loop; end; $$;

-- find the ten smallest values in big: takes ~ 0.18s => OK

explain analyze select r from big order by r offset 0 limit 10;

-- now do the same, but add an expensive() column to the result:
-- takes ~ 29s => WHY?

explain analyze select r, expensive() from big order by r offset 0 limit 10;

-- clean up :)

\timing off
drop function expensive();
drop table big;

-- ***************************************************************************

version
--------------------------------------------------------------------------------------------------------------
PostgreSQL 9.5.4 on x86_64-apple-darwin14.5.0, compiled by Apple LLVM version 7.0.2 (clang-700.1.81), 64-bit
(1 row)

Time: 0.386 ms
CREATE FUNCTION
Time: 0.456 ms
SELECT 1000000
Time: 466.814 ms
ANALYZE
Time: 73.770 ms
Timing is on.
DO
Time: 33.922 ms
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Limit (cost=36034.64..36034.67 rows=10 width=8) (actual time=182.361..182.363 rows=10 loops=1)
-> Sort (cost=36034.64..38534.64 rows=1000000 width=8) (actual time=182.360..182.361 rows=10 loops=1)
Sort Key: r
Sort Method: top-N heapsort Memory: 25kB
-> Seq Scan on big (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.022..99.777 rows=1000000 loops=1)
Planning time: 0.070 ms
Execution time: 182.377 ms
(7 rows)

Time: 182.689 ms
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Limit (cost=286034.64..286034.67 rows=10 width=8) (actual time=28932.311..28932.314 rows=10 loops=1)
-> Sort (cost=286034.64..288534.64 rows=1000000 width=8) (actual time=28932.309..28932.310 rows=10 loops=1)
Sort Key: r
Sort Method: top-N heapsort Memory: 25kB
-> Seq Scan on big (cost=0.00..264425.00 rows=1000000 width=8) (actual time=0.062..28822.520 rows=1000000 loops=1)
Planning time: 0.038 ms
Execution time: 28932.339 ms
(7 rows)

Time: 28932.908 ms
Timing is off.
DROP FUNCTION
DROP TABLE

-- ***************************************************************************

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Pavel Stehule 2017-04-05 08:06:25 Re: Query never completes with low work_mem (at least not within one hour)
Previous Message Günce Kaya 2017-04-05 07:44:37 calculating table and index size