Re: Correct use of cursors for very large result sets in Postgres

From: Mike Beaton <mjsbeaton(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Correct use of cursors for very large result sets in Postgres
Date: 2017-02-21 13:49:09
Message-ID: CAHzAAWSKGo16vRAsaYp=8wM5E=GfGU18o7FH+XryN9g4S4C3oQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Thanks, Tom.

Wouldn't this mean that cursors are noticeably non-optimal even for normal
data sizes, since the entire data to be streamed from the table is always
duplicated into another buffer and then streamed?

> if you want the whole query result at once, why are you bothering with a
cursor?

The PostgreSQL docs (
https://www.postgresql.org/docs/9.6/static/plpgsql-cursors.html#AEN66551)
clearly
recommend cursors as a way to return a reference to a large result set from
a function (as I understood, this is recommended precisely as a way to
avoid tuple-based buffering of the data).

So following that advice, it's not unreasonable that I would actually have
a cursor to a large dataset.

Then, I would ideally want to be able to fetch the data from that cursor
without the entire data getting duplicated (even if only a bit at a time
instead of all at once, which seems to be the best case behaviour) as I go.

Additionally, I thought that if I had a streaming use-case (which I do),
and a streaming data-access layer (which I do), then since `SELECT * FROM
large` is absolutely fine, end-to-end, in that situation, then by symmetry
and the principle of least astonishment `FETCH ALL FROM cursor` might be
fine too.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Pietro Pugni 2017-02-21 13:49:39 Suggestions for a HBA controller (6 x SSDs + madam RAID10)
Previous Message Tom Lane 2017-02-21 13:32:09 Re: Correct use of cursors for very large result sets in Postgres