Re: Out of memory error on huge resultset

From: Dave Cramer <Dave(at)micro-automation(dot)net>
To: "Simpson, Mike W" <mike(dot)simpson(at)pbs(dot)proquest(dot)com>
Cc: pgsql-jdbc <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Out of memory error on huge resultset
Date: 2002-10-11 22:18:47
Message-ID: 1034374735.1893.141.camel@inspiron.cramers
Lists: pgsql-jdbc

This is more or less the same problem: you still need to cache rows and
know where you are in the result set. Cursors make it easier because you
can move absolute or relative within a cursor. Also, as it stands now,
there is a MOVE to the end which, as it turns out, gives you the number
of rows in the result set.
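
For the archives, here's a minimal sketch of that cursor approach at the
JDBC level. The connection details and table name are made up, and
whether MOVE's command status actually reports the skipped-row count
(and whether backward moves work) depends on the server version, so
treat this purely as an illustration:

    import java.sql.*;

    public class CursorSketch {
        public static void main(String[] args) throws Exception {
            Connection con = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/test", "user", "pass");
            con.setAutoCommit(false);          // DECLARE CURSOR needs a transaction

            Statement st = con.createStatement();
            st.execute("DECLARE huge CURSOR FOR SELECT * FROM a");

            // The "move to the end" trick: the MOVE command's status can
            // report how many rows were skipped, i.e. the result size.
            st.execute("MOVE FORWARD ALL IN huge");
            st.execute("MOVE BACKWARD ALL IN huge");   // rewind to the start

            // Pull the rows in windows instead of all at once.
            boolean more = true;
            while (more) {
                ResultSet rs = st.executeQuery("FETCH FORWARD 1000 FROM huge");
                more = false;
                while (rs.next()) {
                    more = true;               // process one row here
                }
                rs.close();
            }

            st.execute("CLOSE huge");
            con.commit();
            con.close();
        }
    }

Only one window of rows is ever held in memory, which is the whole point.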

Just for my own elucidation, is anyone working on an implementation?

Nic, you mentioned you would have a shot at it?

Dave
On Fri, 2002-10-11 at 17:49, Simpson, Mike W wrote:
> This definitely isn't my field, but in our code we're perfectly happy
> setting LIMIT and OFFSET. In combination with stored procedures,
> performance is excellent.
>
> If you really want a driver change, wouldn't it be simpler to just append
> LIMIT and OFFSET? Any time setFetchSize() is called, just tack on
>     query += " LIMIT " + getFetchSize() + " OFFSET " + rowPointer;
> and have the rs.next()/cursor code increment rowPointer.
>
> Just my 2 cents...
> Mike
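
A minimal sketch of what Mike describes, done at the application level
(the table name, ordering column, and connection details are invented;
note that without an ORDER BY the LIMIT/OFFSET windows would not be
stable):

    import java.sql.*;

    public class LimitOffsetSketch {
        public static void main(String[] args) throws Exception {
            Connection con = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/test", "user", "pass");
            int fetchSize = 1000;
            int rowPointer = 0;
            Statement st = con.createStatement();
            while (true) {
                ResultSet rs = st.executeQuery(
                        "SELECT * FROM a ORDER BY id"
                        + " LIMIT " + fetchSize + " OFFSET " + rowPointer);
                int seen = 0;
                while (rs.next()) {
                    seen++;                    // process one row here
                }
                rs.close();
                if (seen < fetchSize) break;   // short window means we hit the end
                rowPointer += fetchSize;       // advance the window
            }
            con.close();
        }
    }

One caveat: the server still has to scan and discard the OFFSET rows, so
the later windows get progressively slower on a big table.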
>
> -----Original Message-----
> From: snpe [mailto:snpe(at)snpe(dot)co(dot)yu]
> Sent: Friday, October 11, 2002 3:43 PM
> To: Dave Cramer
> Cc: pgsql-jdbc; PostgreSQL-development
> Subject: Re: [JDBC] Out of memory error on huge resultset
>
>
> Could we do this:
> We cache the first 1000 rows (or fetchSize rows) from the beginning.
> If the table has fewer than 1000 rows we cache them all, but if it has
> more rows and the user requests row 1001, we fetch again from the
> beginning, skip the first 1000 rows, and cache the next window (or
> maybe keep fetching, if that is possible). When the user requests the
> last row, we fetch all rows but keep only the final 1000, and so on.
>
> In short, we cache only fetchSize rows and re-read from the beginning
> when the user moves backward (or maybe whenever the user moves outside
> our fetchSize window).
>
> This is slow for large tables, but it is a workable solution until the
> developers get us something better from the backend. If the table has
> fewer than fetchSize rows this behaves the same as the current code,
> and we could enforce a minimum fetchSize for better performance with
> small tables.
>
> regards
> Haris Peco
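
A rough sketch of the window cache Haris describes, as it might sit
inside a driver that can stream rows from the backend (the class, field,
and method names here are all invented; with today's driver the inner
executeQuery would itself cache everything, so this only pays off once
rows can be streamed):

    import java.sql.*;
    import java.util.*;

    class WindowedResult {
        private final Connection con;
        private final String query;
        private final int fetchSize;
        private int windowStart = -1;              // absolute index of rows.get(0)
        private final List<Object[]> rows = new ArrayList<>();

        WindowedResult(Connection con, String query, int fetchSize) {
            this.con = con;
            this.query = query;
            this.fetchSize = fetchSize;
        }

        // Return row `index` (0-based), reloading the window when the
        // caller moves outside the cached fetchSize rows.
        Object[] getRow(int index) throws SQLException {
            if (windowStart < 0 || index < windowStart
                    || index >= windowStart + rows.size()) {
                loadWindow(index);                 // slow path: re-scan from the start
            }
            return rows.get(index - windowStart);
        }

        private void loadWindow(int start) throws SQLException {
            rows.clear();
            windowStart = start;
            try (Statement st = con.createStatement();
                 ResultSet rs = st.executeQuery(query)) {
                int cols = rs.getMetaData().getColumnCount();
                int i = 0;
                while (rs.next() && rows.size() < fetchSize) {
                    if (i >= start) {              // skip rows before the window
                        Object[] row = new Object[cols];
                        for (int c = 1; c <= cols; c++) {
                            row[c - 1] = rs.getObject(c);
                        }
                        rows.add(row);
                    }
                    i++;
                }
            }
        }
    }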
> On Friday 11 October 2002 08:13 pm, Dave Cramer wrote:
> > No,
> >
> > It doesn't have to store them, only display them
> >
> > Dave
> >
> > On Fri, 2002-10-11 at 12:48, snpe wrote:
> > > Hello,
> > > Does that mean that psql uses cursors?
> > >
> > > regards
> > > Haris Peco
> > >
> > > On Friday 11 October 2002 05:58 pm, Dave Cramer wrote:
> > > > This really is an artifact of the way that postgres gives us the data.
> > > >
> > > > When you query the backend you get *all* of the results in the query,
> > > > and there is no indication of how many results you are going to get.
> > > > In simple selects it would be possible to get some idea by using
> > > > count(field), but this wouldn't work nearly enough times to make it
> > > > useful. So that leaves us with using cursors, which still won't tell
> > > > you how many rows you are getting back, but at least you won't have
> > > > the memory problems.
> > > >
> > > > This approach is far from trivial, which is why it hasn't been
> > > > implemented as of yet. Keep in mind that result sets support things
> > > > like move(n), first(), and last(), the last of which will be the
> > > > trickiest. Not to mention updateable result sets.
> > > >
> > > > As it turns out there is a mechanism to get to the end, MOVE 0 in a
> > > > cursor, which is currently being considered a bug.
> > > >
> > > > Dave
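
To make that concrete, here is a rough sketch (not the driver's actual
code) of how ResultSet positioning could translate into cursor commands;
the cursor name and the exact MOVE semantics are assumptions:

    import java.sql.*;

    class CursorPosition {
        private final Statement st;
        private final String cursor;

        CursorPosition(Statement st, String cursor) {
            this.st = st;
            this.cursor = cursor;
        }

        // ResultSet.absolute(n): rewind, then step forward n-1 rows so
        // the next FETCH returns row n. Guard against MOVE 0, which Dave
        // notes currently jumps to the end.
        void absolute(int n) throws SQLException {
            st.execute("MOVE BACKWARD ALL IN " + cursor);
            if (n > 1) {
                st.execute("MOVE FORWARD " + (n - 1) + " IN " + cursor);
            }
        }

        // ResultSet.last(), the tricky one: move past the end, then back
        // up one row so the next FETCH returns the final row.
        void last() throws SQLException {
            st.execute("MOVE FORWARD ALL IN " + cursor);
            st.execute("MOVE BACKWARD 1 IN " + cursor);
        }
    }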
> > > >
> > > > On Fri, 2002-10-11 at 11:44, Doug Fields wrote:
> > > > > At 08:27 AM 10/11/2002, snpe wrote:
> > > > > > Barry,
> > > > > > Is it true?
> > > > > > I created a table with a single varchar(500) column and entered 1
> > > > > > million rows of 10-20 characters each. The JDBC query 'select *
> > > > > > from a' gets an 'out of memory' error, but psql does not.
> > > > > > I then inserted 8 million rows and psql still works (slowly, but
> > > > > > it works).
> > > > >
> > > > > The way the code works in JDBC is, in my opinion, a little poor but
> > > > > possibly mandated by JDBC design specs.
> > > > >
> > > > > It reads the entire result set from the database backend and caches
> > > > > it in a horrible Vector (which should really be a List and which
> > > > > should at least make an attempt to get the # of rows ahead of time
> > > > > to avoid all the resizing problems).
> > > > >
> > > > > Then, it doles it out from memory as you go through the ResultSet
> > > > > with the next() method.
> > > > >
> > > > > I would have hoped (but was wrong) that it streamed - WITHOUT LOADING
> > > > > THE WHOLE THING - through the result set as each row is returned from
> > > > > the backend, thus ensuring that you never use much more memory than
> > > > > one line. EVEN IF you have to keep the connection locked.
> > > > >
> > > > > The latter is what I expected it to do. The former is what it does.
> > > > > So, it necessitates turning EVERY SELECT query which you think has
> > > > > more than a few rows (or which you think COULD have more than a few
> > > > > rows, "few" being defined by your VM memory limits) into a
> > > > > cursor-based query. Really kludgy. I intend to write a class to do
> > > > > that for every SELECT query for me automatically.
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Doug
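
The wrapper class Doug mentions might look something like this sketch
(the class, cursor name, and API are invented here, not anything that
exists in the driver):

    import java.sql.*;

    // Turns any SELECT into a cursor-based query and hands the rows
    // back chunk by chunk, so memory stays bounded by chunkSize rows.
    class ChunkedSelect implements AutoCloseable {
        private final Statement st;
        private final int chunkSize;

        ChunkedSelect(Connection con, String select, int chunkSize)
                throws SQLException {
            con.setAutoCommit(false);      // the cursor lives in a transaction
            this.st = con.createStatement();
            this.chunkSize = chunkSize;
            st.execute("DECLARE wrap_cur CURSOR FOR " + select);
        }

        /** Fetches the next chunk of rows; the result is empty when done. */
        ResultSet nextChunk() throws SQLException {
            return st.executeQuery(
                    "FETCH FORWARD " + chunkSize + " FROM wrap_cur");
        }

        @Override
        public void close() throws SQLException {
            st.execute("CLOSE wrap_cur");
            st.close();
        }
    }

The caller loops on nextChunk() until a chunk comes back with no rows,
processing one bounded window at a time.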
> > > > >
> > > > > > In the C library there is an 'execute query' without a fetch; in
> > > > > > JDBC, execute fetches all rows, and that is the problem. I think
> > > > > > executeQuery should only prepare the query, and the fetch
> > > > > > (ResultSet.next or similar) should fetch only fetchSize rows. I am
> > > > > > not sure, but I think this is a problem with JDBC, not PostgreSQL.
> > > > > > Hackers?
> > > > > > Does psql fetch all rows, and if not, how many?
> > > > > > Can I change the fetch size in psql?
> > > > > > CURSOR, FETCH and MOVE aren't a solution: if I use JDBC in a
> > > > > > third-party IDE, I can't force that approach.
> > > > > >
> > > > > >regards
> > > > > >
> > > > > >On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:
> > > > > > > Nick,
> > > > > > >
> > > > > > > This has been discussed before on this list many times. But the
> > > > > > > short answer is that that is how the postgres server handles
> > > > > > > queries. If you issue a query the server will return the entire
> > > > > > > result. (try the same query in psql and you will have the same
> > > > > > > problem). To work around this you can use explicit cursors (see
> > > > > > > the DECLARE CURSOR, FETCH, and MOVE sql commands for postgres).
> > > > > > >
> > > > > > > thanks,
> > > > > > > --Barry
> > > > > > >
> > > > > > > Nick Fankhauser wrote:
> > > > > > > > I'm selecting a huge ResultSet from our database- about one
> > > > > > > > million rows, with one of the fields being varchar(500). I get
> > > > > > > > an out of memory error from java.
> > > > > > > >
> > > > > > > > If the whole ResultSet gets stashed in memory, this isn't
> > > > > > > > really surprising, but I'm wondering why this happens (if it
> > > > > > > > does), rather than a subset around the current record being
> > > > > > > > cached and other rows being retrieved as needed.
> > > > > > > >
> > > > > > > > If it turns out that there are good reasons for it to all be
> > > > > > > > in memory, then my question is whether there is a better approach
> > > > > > > > that people typically use in this situation. For now, I'm
> > > > > > > > simply breaking up the select into smaller chunks, but that
> > > > > > > > approach won't be satisfactory in the long run.
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > -Nick
> > > > > > > >
> > > > > > > >
> > > > > > > > ---------------------------------------------------------------------
> > > > > > > > Nick Fankhauser  nickf(at)ontko(dot)com
> > > > > > > > Phone 1.765.935.4283  Fax 1.765.962.9788
> > > > > > > > Ray Ontko & Co.  Software Consulting Services
> > > > > > > > http://www.ontko.com/
