Retrieving ResultSets with millions of rows - OutOfMemoryError

From: Doug Fields <dfields-postgres(at)pexicom(dot)com>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: Retrieving ResultSets with millions of rows - OutOfMemoryError
Date: 2002-09-13 21:43:49
Message-ID: 5.1.0.14.2.20020913173522.01f25020@pop.pexicom.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc

Hello,

I've just come to the startling realization that the 7.2 JDBC code loads
every single row in a ResultSet into a Vector. (core/QueryExecutor.java)

Unfortunately, I run some queries which return millions of rows, each row
which could well be 100-1000 bytes and more.

Hence, I get an OutOfMemoryError.

For some queries, there's an obvious workaround: issue the query with an
appropriate suffix "ORDER BY <something> LIMIT <something> OFFSET
<something>" several times in succession. This will, of course, work mostly
fine assuming it's a simple single-table query with an appropriate index
(such as a primary key).

However, some of my queries are complex one-to-many joins with no
particular ordering (or no index for quick ordering). These would be much
harder to do that trick with, and/or incur extreme amounts of database
overhead in running the query hundreds of times (for example, if I were to
handle 10,000 rows at a time).

Is there any way to get a ResultSet which is actually live streamed, which
keeps no knowledge of previous rows or the row count? It seems utterly
wasteful to me that I should need to load millions of rows into a Vector
(try an ArrayList next time, for minor speedups) when all I want to do is
stream the results, not ever needing to know the total number of rows ahead
of time nor any previous (or subsequent) rows to the "current one" from
ResultSet.next().

I can't imagine I'm the first person to be using JDBC to access tables with
tens of millions of rows to attempt to access significant portions of them
at a time.

Thanks,

Doug

Responses

Browse pgsql-jdbc by date

  From Date Subject
Next Message Dave Cramer 2002-09-14 00:28:43 Re: Retrieving ResultSets with millions of rows -
Previous Message Daniel Serodio 2002-09-13 17:54:11 Bug in PreparedStatement and 'numeric' columns