Re: json(b)_array_elements use causes very large memory usage when also referencing entire json document

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Lucas Fairchild-Madar <lucas(dot)madar(at)gmail(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: json(b)_array_elements use causes very large memory usage when also referencing entire json document
Date: 2017-10-06 19:37:03
Message-ID: 4514.1507318623@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Lucas Fairchild-Madar <lucas(dot)madar(at)gmail(dot)com> writes:
> Sure, here's a bash script. Assumes you have PGDATABASE, etc set.
> Verified this also blows up on pg 10. I haven't verified the new patch.

Oh, I see the problem. This is a different animal, because it's actually
an intra-row leak, as it were. What you've got is

select data->'id', jsonb_array_elements(data->'items') from kaboom;

where the SRF jsonb_array_elements() emits a lot of values. For each
of those values, data->'id' gets evaluated over again, and we can't
reclaim memory in the per-tuple context until we've finished the whole
cycle for the current row of "kaboom". So a leak would occur in any
case ... but it's particularly awful in this case, because data->'id'
involves detoasting the rather wide value of "data", which is then
promptly leaked. So the total memory consumption is more or less
proportional to O(N^2) in the length of "data".

This has been like this since forever, and it's probably impractical
to do anything about it pre-v10, given the unstructured way that
targetlist SRFs are handled. You could dodge the problem by moving
the SRF to a lateral FROM item:

select data->'id', ja
from kaboom, lateral jsonb_array_elements(data->'items') as ja;

(The LATERAL keyword is optional here, but I like it because it
makes it clearer what's happening.)

As of v10, it might be possible to fix this for the tlist case
as well, by doing something like using a separate short-lived
context for the non-SRF tlist items.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Lucas Fairchild-Madar 2017-10-06 20:03:52 Re: json(b)_array_elements use causes very large memory usage when also referencing entire json document
Previous Message Lucas Fairchild-Madar 2017-10-06 19:11:40 Re: json(b)_array_elements use causes very large memory usage when also referencing entire json document