From: | "Vyacheslav Kalinin" <vka(at)mgcp(dot)com> |
---|---|
To: | PGSQL <pgsql-general(at)postgresql(dot)org> |
Subject: | Couple of question on functions |
Date: | 2008-05-14 00:39:44 |
Message-ID: | 9b1af80e0805131739i44672f14hf5402d2c52378cf9@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hello,
Reading the manual recently I came across this: (
http://www.postgresql.org/docs/8.3/interactive/xfunc-volatility.html )
> Because of the snapshotting behavior of MVCC (see Chapter 13<http://www.postgresql.org/docs/8.3/interactive/mvcc.html>)
a function containing only SELECT commands can safely be marked
> STABLE, even if it selects from tables that might be undergoing
modifications by concurrent queries. PostgreSQL will execute a STABLE
> function using the snapshot established for the calling query, and so it
will see a fixed view of the database throughout that query. Also
> note that the current_timestamp family of functions qualify as stable,
since their values do not change within a transaction.
It stroke me that it might be not all that safe to mark SELECTing only
function STABLE vs VOLATILE (or vice versa). Consider an example:
create table t1(id int);
create or replace function f1() returns void as
$$
declare
i int;
begin
select count(*) into i from t1;
raise notice '%', i;
-- waste some time
for i in 1..700000000 loop
end loop;
select count(*) into i from t1;
raise notice '%', i;
end;
$$
language 'plpgsql';
Now in first connection do:
select f1();
While the execution is in the loop which takes a while do in another
connection:
insert into t1 values (1);
The function returns with the following notices:
NOTICE: 0
NOTICE: 1
Should I change the volatility type of f1() to STABLE and run the above
again I would get:
NOTICE: 1
NOTICE: 1
It looks like at least plpgsql functions use most recent snapshot on each
call to SPI manager instead that of a calling query, so since default
transaction isolation level in postgres is READ COMMITTED concurrent
transactions may affect result of pure-reader VOLATILE function. I wonder if
any-language (including SQL,and C) function would behave in the same way?
Another thing I've recently discover is that SQL function seem to be
unexpectedly slow to call. Example:
create or replace function f2sql(int) returns int as
$$
select case when $1 < 100000 then 1
when 100000 <= $1 and $1 < 500000 then 2
when $1 >= 500000 then 3
end;
$$
language 'sql' immutable;
create or replace function f2plpgsql(int) returns int as
$$
begin
return case when $1 < 100000 then 1
when 100000 <= $1 and $1 < 500000 then 2
when $1 >= 500000 then 3
end;
end;
$$
language 'plpgsql' immutable;
These two function do exactly the same calculation on input and differ only
in language used. Now I write some query involving them and wrap it into
another function (so that I could use PERFORM to avoid possible overhead on
fetching results to the client, to cache the plan and to measure the time
in more precise manner):
create or replace function f3() returns void as
$$
declare
st timestamp;
begin
st := clock_timestamp();
perform f2sql(trunc(1000000*random())::int) +
f2sql(trunc(1000000*random())::int) +
f2sql(trunc(1000000*random())::int) +
f2sql(trunc(1000000*random())::int) +
f2sql(trunc(1000000*random())::int) +
f2sql(trunc(1000000*random())::int) +
f2sql(trunc(1000000*random())::int) +
f2sql(trunc(1000000*random())::int) +
f2sql(trunc(1000000*random())::int) +
f2sql(trunc(1000000*random())::int)
from generate_series(1, 100000);
raise notice '%', clock_timestamp() - st;
end;
$$
language 'plpgsql' ;
create or replace function f4() returns void as
$$
declare
st timestamp;
begin
st := clock_timestamp();
perform f2plpgsql(trunc(1000000*random())::int) +
f2plpgsql(trunc(1000000*random())::int) +
f2plpgsql(trunc(1000000*random())::int) +
f2plpgsql(trunc(1000000*random())::int) +
f2plpgsql(trunc(1000000*random())::int) +
f2plpgsql(trunc(1000000*random())::int) +
f2plpgsql(trunc(1000000*random())::int) +
f2plpgsql(trunc(1000000*random())::int) +
f2plpgsql(trunc(1000000*random())::int) +
f2plpgsql(trunc(1000000*random())::int)
from generate_series(1, 100000);
raise notice '%', clock_timestamp() - st;
end;
$$
language 'plpgsql' ;
Now f4() reports 4.2 sec of runtime on average while f3() - 10.3 sec, that
is a notable difference especially considering that SQL function is likely
to be inlined. Do i miss something?
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2008-05-14 00:43:28 | Re: Making sure \timing is on |
Previous Message | Nathan Thatcher | 2008-05-13 23:33:33 | Re: Alias in the HAVING clause |