My tests show that a WITH HOLD cursor has subtly different semantics from a WITHOUT HOLD cursor

From: Bryn Llewellyn <bryn(at)yugabyte(dot)com>
To: pgsql-general list <pgsql-general(at)lists(dot)postgresql(dot)org>
Cc: laurenz(dot)albe(at)cybertec(dot)at
Subject: My tests show that a WITH HOLD cursor has subtly different semantics from a WITHOUT HOLD cursor
Date: 2023-03-31 19:35:08
Message-ID: 76A5F3E1-DB5F-4DA7-AF29-687D511E3948@yugabyte.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

*Summary*

My tests show that, when a WITHOUT HOLD cursor has to cache results (see Note 1), then the WHERE clause (if present) is stripped off the cursor's defining SELECT statement and the entire unrestricted result set is cached. But when a WITH HOLD cursor is used, then it’s the *restricted* result set that’s cached.

I do see that this wouldn't have a detectable effect when the cursor's defining query doesn't involve any volatile functions. But it does seem that too much data is cached in the "not holdable" case—and this seems to be a bad thing for space use and for speed.

I'd like to know the rationale for this design choice but I've been unable to find any mention of it in the PG doc (see Note 2).

My test design used this statement:

select v from rndm_series() [where rndm_filter()]

And I used a trivial table function "cursor_rows()” that simply iterated over all the rows that the cursor defined, from first through last, to display the cursor's result set three times in succession.

This lead to four runs: "WITH or WITHOUT HOLD" by "with or without the WHERE clause".

I might just as well have used a simple "series()" function wrapped around "generate_series()" that used "raise info" to report that it was called together with a simple "filter()" function that always returned true and, again, that used "raise info" to report that it was called. But I find it too hard to divert "raise info" output to a spool file and have it interleave properly with regular SQL output. So I used a volatile "rndm_series() that always returned ten rows but (using the random() built-in function) produced a different set of values on each call. And I used a volatile "rndm_filter()" function that similarly randomly returned TRUE or FALSE. These devices dramatized the effect I'm that reporting here and made it simple to record the results ordinarily with the \o meta-command.

I'm assuming that my PL/pgSQL functions are opaque to the planner and so that it cannot run the execution plan in backwards order and must, therefore, cash the cursor's result set.

I would have expected the *restricted* result set to be cached for both flavors of cursor—holdable and not. After all, we're taught to expect nonsense results when a volatile function's results are cached—so a cursor should be no different w.r.t. this thinking.

(Of course, I've read the CAUTION in the PG doc for the DECLARE statement that says « Scrollable cursors may give unexpected results if they invoke any volatile functions » and I wouldn't use a volatile function in real life.)
——————————————————————————————

*Note 1*

I read Laurenz's blogpost "WITH HOLD cursors and transactions in PostgreSQL" (www.cybertec-postgresql.com/en/with-hold-cursors-and-transactions-in-postgresql/) and I noted this:

«
[Some] execution plans [for a WITHOUT HOLD cursor] require the explicit keyword SCROLL for the cursor to become scrollable. Such cursors incur an overhead, because the server must cache the entire result set.
»

and then this:

«
PostgreSQL must calculate the complete result set [for a WITH HOLD cursor] at COMMIT time and cache it on the server.
»

In other words, both a WITH HOLD cursor (always) and a WITHOUT HOLD cursor (sometimes) must cache the entire result set.
——————————————————————————————

*Note 2*

Laurenz also wrote this in the "Is the PL/pgSQL refcursor useful in a modern three-tier app?" thread that I started:

«
I understand your confusion, and I believe that the documentation could be improved... I personally find that reading the PostgreSQL documentation gets you far, but only so far: for deep understanding, you have to read the code... I have come to see it as an extension of the documentation that covers the details.
»
——————————————————————————————

*Testcase*

I used PG Version 15.2. The code that follows is self-contained. You need just to connect as an ordinary user to a database where it has the CREATE privilege. Copy it into, say, "t.sql" and start it in psql. (Make sure that you have the usual « AUTOCOMMIT = 'on' » setting.)

\c :db :u1

drop schema if exists s1 cascade;
create schema s1;
set search_path = s1, pg_catalog, pg_temp;

create procedure init_rndm()
set search_path = pg_catalog, pg_temp
language plpgsql
as $body$
begin
perform setseed(0.0::double precision);
end;
$body$;

create function rndm_filter()
returns boolean
set search_path = pg_catalog, pg_temp
language sql
as $body$
select (random() > 0.7::float8);
$body$;

-- Ensure that the planner cannot know how the rows are delivered.
create function rndm_series()
returns table(v int)
set search_path = s1, pg_catalog, pg_temp
language plpgsql
as $body$
declare
val int not null := 0;
ctr int not null := 0;
begin
loop
val := val + 1;
if (random() > 0.9::float8) then
ctr := ctr + 1;
exit when ctr > 10;
v := val; return next;
end if;
end loop;
end;
$body$;

create procedure open_holdable_cursor(holdable in boolean, filter_series in boolean, caption inout text)
language plpgsql
as $body$
declare
cur constant refcursor not null := 'cur';

without_filter constant text not null :=
'declare cur scroll cursor without hold for '||
'select v from rndm_series()';

with_filter constant text not null :=
without_filter||' where rndm_filter()';

without_filter_holdable constant text not null :=
replace(without_filter, 'without', 'with');

with_filter_holdable constant text not null :=
replace(with_filter, 'without', 'with');
begin
begin
close cur;
exception when invalid_cursor_name then
null;
end;

case holdable
when false then
case filter_series
when false then execute without_filter;
when true then execute with_filter;
end case;

when true then
case filter_series
when false then execute without_filter_holdable;
when true then execute with_filter_holdable;
end case;
end case;

declare
holdable text not null := '';
stmt text not null := '';
begin
select
case is_holdable
when true then 'with hold'
when false then 'without hold'
end,
statement
into holdable, stmt
from pg_cursors where name = 'cur';

stmt := replace(stmt, ' without hold', '');
stmt := replace(stmt, ' with hold', '');
stmt := replace(stmt, 'declare cur scroll cursor for ', '');
caption := rpad(holdable||':', 14)||stmt;
end;
end;
$body$;

create function cursor_rows()
returns table(z int)
set search_path = s1, pg_catalog, pg_temp
language plpgsql
as $body$
declare
cur constant refcursor not null := 'cur';
begin
move absolute 0 in cur;
loop
fetch next from cur into z;
exit when not found;
return next;
end loop;
end;
$body$;
--------------------------------------------------------------------------------

\t on
\o spool.txt

start transaction;
call init_rndm();
call open_holdable_cursor(false, false, '');
select z from cursor_rows();
select z from cursor_rows();
select z from cursor_rows();
rollback;

start transaction;
call init_rndm();
call open_holdable_cursor(false, true, '');
select z from cursor_rows();
select z from cursor_rows();
select z from cursor_rows();
rollback;

call init_rndm();
call open_holdable_cursor(true, false, '');
select z from cursor_rows();
select z from cursor_rows();
select z from cursor_rows();

call init_rndm();
call open_holdable_cursor(true, true, '');
select z from cursor_rows();
select z from cursor_rows();
select z from cursor_rows();

\o
\t off

--------------------------------------------------------------------------------

Here's what my "spool.txt" contained:

without hold: select v from rndm_series()

9
34
35
39
40
47
91
101
136
137

9
34
35
39
40
47
91
101
136
137

9
34
35
39
40
47
91
101
136
137

without hold: select v from rndm_series() where rndm_filter()

9
91

9
34
39
47
91

35
136
137

with hold: select v from rndm_series()

9
34
35
39
40
47
91
101
136
137

9
34
35
39
40
47
91
101
136
137

9
34
35
39
40
47
91
101
136
137

with hold: select v from rndm_series() where rndm_filter()

9
91

9
91

9
91

--------------------------------------------------------------------------------

Responses

Browse pgsql-general by date

  From Date Subject
Next Message David G. Johnston 2023-03-31 20:36:12 Re: My tests show that a WITH HOLD cursor has subtly different semantics from a WITHOUT HOLD cursor
Previous Message Siddharth Jain 2023-03-31 17:35:21 Re: Question on creating keys on partitioned tables