From: | Jan Urbański <wulczer(at)wulczer(dot)org> |
---|---|
To: | Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | pl/python long-lived allocations in datum->dict transformation |
Date: | 2012-02-05 18:54:11 |
Message-ID: | 4F2ED053.1010904@wulczer.org |
Lists: | pgsql-hackers |
Consider this:
create table arrays as select array[random(), random(), random(),
    random(), random(), random()] as a from generate_series(1, 1000000);
create or replace function plpython_outputfunc() returns void as $$
c = plpy.cursor('select a from arrays')
for row in c:
    pass
$$ language plpythonu;
When the function runs, every datum gets transformed into a Python dict,
which involves calling the type's output function and therefore
allocating memory. That memory is allocated in the SPI context, so it
accumulates until the function finishes.
This is annoying for functions that plough through large tables while
doing some calculation. Attached is a patch that converts PostgreSQL
Datums into Python dict objects in a scratch memory context that is
reset after every conversion.
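To illustrate why the reset matters, here is a toy model in plain Python. It is only a sketch, not the patch and not PostgreSQL's memory context API: `ToyMemoryContext`, `convert_rows`, and the 64 bytes per row are made-up names and numbers chosen for illustration.

```python
class ToyMemoryContext:
    """Hypothetical stand-in for a memory context: tracks live and peak bytes."""

    def __init__(self, name):
        self.name = name
        self.live = 0   # bytes currently allocated in this context
        self.peak = 0   # high-water mark seen so far

    def alloc(self, nbytes):
        self.live += nbytes
        self.peak = max(self.peak, self.live)

    def reset(self):
        # Release everything allocated in this context in one go.
        self.live = 0


def convert_rows(nrows, bytes_per_row, ctx, reset_each_row):
    """Simulate converting nrows rows, each allocating bytes_per_row in ctx."""
    for _ in range(nrows):
        ctx.alloc(bytes_per_row)      # e.g. the output function's result
        if reset_each_row:
            ctx.reset()               # scratch context: freed after each row
    return ctx.peak


# Long-lived context: allocations pile up until the whole loop is done.
long_lived = convert_rows(1000, 64, ToyMemoryContext("long-lived"), reset_each_row=False)
# Scratch context: peak usage stays at the cost of a single row.
scratch = convert_rows(1000, 64, ToyMemoryContext("scratch"), reset_each_row=True)
print(long_lived, scratch)  # prints: 64000 64
```

In this model the long-lived context's peak grows linearly with the number of rows, while the scratch context's peak stays constant, which is the behaviour the patch is after for functions that iterate over large tables.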
Cheers,
Jan
Attachment | Content-Type | Size |
---|---|---|
plpython-tuple-to-dict-leak.patch | text/x-diff | 4.2 KB |