Re: problem with lost connection while running long PL/R query

From: "David M(dot) Kaplan" <david(dot)kaplan(at)ird(dot)fr>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ryan Kelly <rpkelly22(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: problem with lost connection while running long PL/R query
Date: 2013-05-17 09:09:35
Message-ID: 5195F3CF.6020800@ird.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

Of course today after a reboot to update to the newest kernel,
everything works without crashing...

I imagine that yesterday the problem was that I had forgotten that I had
a Windows virtual machine running on the server that was eating a good
piece of memory. Still, using a cursor to page through the data should
have avoided major memory difficulties...

As a footnote, I can confirm that PL/R was not eating up memory little
by little (see results of memory.profile() at beginning and end of each
function invokation below).

Thanks,
David

Run 1:

NOTICE: Starting predict_classification_model
NOTICE: NULL symbol pairlist closure environment
promise
NOTICE: 1 7847 173768 4308 1129 6273
NOTICE: language special builtin char logical
integer
NOTICE: 46996 189 1839 9387 7351 17229
NOTICE: double complex character ... any list
NOTICE: 2190 1 42810 1 0 13456
NOTICE: expression bytecode externalptr weakref raw S4
NOTICE: 3 10246 823 198 199 1061

NOTICE: Ending predict_classification_model
NOTICE: NULL symbol pairlist closure environment
promise
NOTICE: 1 8062 181354 4360 1129 6273
NOTICE: language special builtin char logical
integer
NOTICE: 50616 189 1839 2085318 7496 17972
NOTICE: double complex character ... any list
NOTICE: 2393 1 43459 1 0 13929
NOTICE: expression bytecode externalptr weakref raw S4
NOTICE: 3 10683 814 187 188 1061

Run 2:

NOTICE: Starting predict_classification_model
NOTICE: NULL symbol pairlist closure environment
promise
NOTICE: 1 8062 180936 4360 1128 6271
NOTICE: language special builtin char logical
integer
NOTICE: 50536 189 1839 9684 7496 17878
NOTICE: double complex character ... any list
NOTICE: 2258 1 43312 1 0 13899
NOTICE: expression bytecode externalptr weakref raw S4
NOTICE: 3 10683 813 188 189 1061

NOTICE: Ending predict_classification_model
NOTICE: NULL symbol pairlist closure environment
promise
NOTICE: 1 8062 181356 4360 1129 6273
NOTICE: language special builtin char logical
integer
NOTICE: 50617 189 1839 2085317 7496 17972
NOTICE: double complex character ... any list
NOTICE: 2393 1 43459 1 0 13929
NOTICE: expression bytecode externalptr weakref raw S4
NOTICE: 3 10683 814 187 188 1061

Run 3:

NOTICE: Starting predict_classification_model
NOTICE: NULL symbol pairlist closure environment
promise
NOTICE: 1 8062 180936 4360 1128 6271
NOTICE: language special builtin char logical
integer
NOTICE: 50536 189 1839 9684 7496 17878
NOTICE: double complex character ... any list
NOTICE: 2258 1 43312 1 0 13899
NOTICE: expression bytecode externalptr weakref raw S4
NOTICE: 3 10683 813 188 189 1061

NOTICE: Ending predict_classification_model
NOTICE: NULL symbol pairlist closure environment
promise
NOTICE: 1 8062 181356 4360 1129 6273
NOTICE: language special builtin char logical
integer
NOTICE: 50617 189 1839 2085319 7496 17972
NOTICE: double complex character ... any list
NOTICE: 2393 1 43459 1 0 13929
NOTICE: expression bytecode externalptr weakref raw S4
NOTICE: 3 10683 814 187 188 1061

On 05/16/2013 11:40 PM, Joe Conway wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 05/16/2013 08:40 AM, Tom Lane wrote:
>> "David M. Kaplan" <david(dot)kaplan(at)ird(dot)fr> writes:
>>> Thanks for the help. You have definitely identified the problem,
>>> but I am still looking for a solution that works for me. I tried
>>> setting vm.overcommit_memory=2, but this just made the query
>>> crash quicker than before, though without killing the entire
>>> connection to the database. I imagine that this means that I
>>> really am trying to use more memory than the system can handle?
>>> I am wondering if there is a way to tell postgresql to flush a
>>> set of table lines out to disk so that the memory they are using
>>> can be liberated.
>> Assuming you don't have work_mem set to something unreasonably
>> large, it seems likely that the excessive memory consumption is
>> inside your PL/R function, and not the fault of Postgres per se.
>> You might try asking in some R-related forums about how to reduce
>> the code's memory usage.
> The two "classic" approaches to this with PL/R are either create a
> custom aggregate with the PL/R as the final function (i.e. work on one
> group at a time) or use the SPI cursor functionality within the PL/R
> function and page your data using a cursor. Not all forms of analysis
> lend themselves to these approaches, but perhaps yours does.
>
> Ultimately I would like to implement a form of R data.frame that does
> the paging with a cursor transparently for you, but I have not been
> able to find the time so far.
>
> Joe
>
>
> - --
> Joe Conway
> credativ LLC: http://www.credativ.us
> Linux, PostgreSQL, and general Open Source
> Training, Service, Consulting, & 24x7 Support
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (GNU/Linux)
> Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
>
> iQIcBAEBAgAGBQJRlVJdAAoJEDfy90M199hle8gP+wU+GSJ44g26VBBAy3po/E/Y
> 9+pwxBhJe0x6v5PXtuM8Bzyy4yjlKCgzDj4XdZpEU7SYR+IKj7tWCihqc+Fuk1t1
> EjR2VUJwpSMztRvEIqWW8rX/DFGaVYCt89n0neKfKL/XJ5rbqMqQAUPbxMaBtW/p
> 7EXo8RjVBMYibkvKrjpYJjLTuOTWkQCiXx5hc4HVFN53DYOF46rdFxMYUe5KLYTL
> mZOnSoV0yrsaPGnxRIY0uzRv7ZTTBmB2o4TIWpTySx2rHNLqAJIT22wl0pfkjksH
> JYvko3rWhSg7vSf+8RDN6X1eMAXcUO7H2NR5IdOoXEX2bzqTmDBQUjOcb5WR1yUd
> L5XuT5WYiTpyzU8qAtPEVirwFnEwUN1tR6wDoVsseIWwXUYqSuXtg9qjFNAXZ1Hr
> 05yxuzexOEzLQNwSXWhsCrLdnndEHrJ6pDlLaUCPVybxwwwW9BfS2fJUz+X63M8x
> l5DYbyl6q6o2J2bs4UGCTk4r/1Qq/R9pApkWzsckTtF6zl49mzwzPnh5b/JcB+4x
> u17Te+s3cRGcX09lt7qf9cWkv1uUF/Qw0ntBhW8TY2HYhbWVIEmiZV1HIksXf+nw
> EBFshWs2/H75OPnhN9YNq3tjCuiR7o/eaZeINfGs2LzGIJvHpcjMDBgFFTES7CYV
> Y20XukH07h9XcJGTsf0o
> =TwfD
> -----END PGP SIGNATURE-----

--
**********************************
David M. Kaplan
Charge de Recherche 1

Institut de Recherche pour le Developpement
Centre de Recherche Halieutique Mediterraneenne et Tropicale
av. Jean Monnet
B.P. 171
34203 Sete cedex
France

Phone: +33 (0)4 99 57 32 27
Fax: +33 (0)4 99 57 32 95

http://www.umr-eme.org/team/dkaplan/
http://www.amped.ird.fr/
**********************************

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Cédric Villemain 2013-05-17 11:04:57 Re: [HACKERS] PLJava for Postgres 9.2.
Previous Message sumita 2013-05-17 08:56:31 Re: FATAL: database "a/system_data" does not exist