Re: Understanding memory usage

From: Daniele Varrazzo <daniele(dot)varrazzo(at)gmail(dot)com>
To: Damiano Albani <damiano(dot)albani(at)gmail(dot)com>
Cc: psycopg <psycopg(at)postgresql(dot)org>
Subject: Re: Understanding memory usage
Date: 2013-10-30 18:27:15
Message-ID: CA+mi_8ZVuL1+igjsO1wOTK9vvZ7C4MBSrRcgAOQ0KnZf2gPMKA@mail.gmail.com
Lists: psycopg

On Wed, Oct 30, 2013 at 5:24 PM, Damiano Albani <damiano(dot)albani(at)gmail(dot)com>wrote:

> Hello,
>
>
> On Tue, Oct 29, 2013 at 12:23 AM, Daniele Varrazzo <
> daniele(dot)varrazzo(at)gmail(dot)com> wrote:
>
>>
>> Because the result is returned to the client as the response for the
>> query and is stored inside the cursor. fetch*() only return it to
>> Python.
>>
>
> So why does calling "fetch*()" use *additional* memory then? Does it
> copy the data returned from the database?
>

Data in the cursor is stored in the form of a PQresult structure, which is
an opaque object for which the libpq provides access functions.

Such data is converted into Python objects when fetch*() is called. This
usually implies a copy: Python strings, for example, own their data, and
even returning numbers to Python generally means creating new instances.
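To make the point above concrete, here is a minimal illustrative sketch (pure Python, not psycopg internals): the raw bytes stand in for a row inside a PQresult buffer, and converting them produces new Python objects that own their own memory, independent of the buffer.

```python
# Illustrative sketch only: this is NOT how psycopg is implemented,
# it just shows that turning wire data into Python objects allocates
# new, independently-owned objects.
raw = b"42\thello"           # stand-in for a row held in a PQresult buffer
fields = raw.split(b"\t")    # even splitting copies the byte ranges

value = int(fields[0])       # a brand-new Python int object
text = fields[1].decode()    # a str that owns a copy of its characters

# The Python objects do not share storage with the original buffer,
# so fetching rows costs memory on top of what the cursor holds.
assert value == 42
assert text == "hello"
```

Deleting `raw` afterwards would not affect `value` or `text`, which is exactly why fetch*() adds memory on top of the result already stored in the cursor.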

> By the way, I've re-run my tests but focused on the VmRSS metric, which
> represents how much actual physical memory is used by the process.
>
> And I got the same behaviour, that is, almost no memory is reclaimed after
> having fetched a *large* number of rows.
> For instance, if I fetch 2 million small rows, memory usage peaks around
> 500 MB and then only drops to ~450 MB after the data is freed.
>

What do you mean by "freed"? Have you deleted the cursor and made sure the
gc reclaimed it? The cursor doesn't destroy its internal data until it is
deleted or another query is run (because after fetchall() you can invoke
scroll(0) and fetch the data again). And of course, when the data returned
by fetch*() is released depends on how the client uses it. After a big
query you may see memory usage go down as soon as you execute a trivial
query such as "select 1", because the result is replaced by a smaller one.

> On the other hand, fetching 100 large rows amounts to a 3 GB peak, which
> subsequently falls back to 10 MB.
>
> So is it a problem related to Psycopg itself or rather how Python handles
> memory in general?
>

The only "problem" you may attribute to Psycopg is if you find an unbound
usage of the memory. If you run some piece of code in a loop and see memory
increasing linearly you have found a leak. Otherwise you can attribute the
artefacts you see to the Python GC.
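The loop test described above can be sketched with the standard library alone; `leaky_step` below is a deliberately-leaky hypothetical function, not anything in psycopg, and tracemalloc plays the role of watching VmRSS across iterations.

```python
import tracemalloc

def leaky_step(store):
    # Deliberate leak for illustration: every call allocates a buffer
    # and keeps it reachable, so nothing is ever reclaimed.
    store.append(bytearray(100_000))

store = []
tracemalloc.start()
first = tracemalloc.get_traced_memory()[0]

for _ in range(10):
    leaky_step(store)

last = tracemalloc.get_traced_memory()[0]
tracemalloc.stop()

# Memory grows roughly linearly with the iteration count: the
# signature of a leak. A healthy loop would plateau instead.
assert last - first > 9 * 100_000
```

If a loop around your actual psycopg code shows this linear growth, that would point at a real leak; a sawtooth or plateau is just the allocator and GC at work.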

-- Daniele
