Re: Correct use of cursors for very large result sets in Postgres

From: Mike Beaton <mjsbeaton(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Correct use of cursors for very large result sets in Postgres
Date: 2017-02-19 07:54:18
Message-ID: CAHzAAWTRCdbNkQqfS74jAtoT8ThBnQAuDm+=Uw-JOfgADQ3jtQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

> Seems odd. Is your cursor just on "SELECT * FROM table", or is there
> some processing in there you're not mentioning? Maybe it's a cursor
> WITH HOLD and you're exiting the source transaction?

Hi Tom,

I've deleted my own Stack Overflow answer in favour of Laurenz Albe's one.

New TL;DR (I'm afraid): PostgreSQL is always generating a huge buffer file
on `FETCH ALL FROM CursorToHuge`.

The test data is created by:

`SELECT * INTO large FROM generate_series(1, 10000000) id;`

The test function to generate the cursor is:

````
CREATE OR REPLACE FUNCTION lump() RETURNS refcursor
LANGUAGE plpgsql AS
$$DECLARE
c CURSOR FOR SELECT id FROM large;
BEGIN
c := 'c';
OPEN c;
RETURN c;
END;$$;
````

The two tests are:

`SELECT * FROM large;`

Result: no buffer file.

And:

````
BEGIN;
SELECT lump();
FETCH ALL FROM c;
COMMIT;
````

Result: 14MB buffer, every time.

The buffer file appears in `base\pgsql_tmp` while the data is streaming but
only appears in the Postgres log file at the point when it is released
(which makes sense, as its final size is part of the log row).

This has the additionally confusing result that the buffer file is reported
in the Postgres logs just before the user sees the first row of data on
`psql` (and on anything using `libpq`), but just after the user sees the
last row of data, on any client program which is streaming the data via a
streaming data access layer (such as `Npgsql`, or `JDBC` with the right
configuration).

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Diego Vargas 2017-02-20 21:39:52 Query Performance
Previous Message Tom Lane 2017-02-18 17:43:49 Re: Correct use of cursors for very large result sets in Postgres