From: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | DO with a large amount of statements get stuck with high memory consumption |
Date: | 2016-07-12 19:29:10 |
Message-ID: | CAHyXU0x24k3nATzNWswzHSdzk39On0GfgFtbvZD=anSQSBHcNQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
I've noticed that pl/pgsql functions/do commands do not behave well
when the statement resolves and frees memory. To be clear:
FOR i in 1..1000000
LOOP
INSERT INTO foo VALUES (i);
END LOOP;
...runs just fine while
BEGIN
INSERT INTO foo VALUES (1);
INSERT INTO foo VALUES (2);
...
INSERT INTO foo VALUES (1000000);
END;
(for the curious, create a script yourself via
copy (
select
'do $$begin create temp table foo(i int);'
union all select
format('insert into foo values (%s);', i) from generate_series(1,1000000) i
union all select 'raise notice ''abandon all hope!''; end; $$;'
) to '/tmp/breakit.sql';
...while consume amounts of resident memory proportional to the number
of statemnts and eventually crash the server. The problem is obvious;
each statement causes a plan to get created and the server gets stuck
in a loop where SPI_freeplan() is called repeatedly. Everything is
working as designed I guess, but when this happens it's really
unpleasant: the query is uncancellable and unterminatable, nicht gut.
A pg_ctl kill ABRT <pid> will do the trick but I was quite astonished
to see linux take a few minutes to clean up the mess (!) on a somewhat
pokey virtualized server with lots of memory. With even as little as
ten thousand statements the cleanup time far exceed the runtime of the
statement block.
I guess the key takeaway here is, "don't do that"; pl/pgsql
aggressively generates plans and turns out to be a poor choice for
bulk loading because of all the plan caching. Having said that, I
can't help but wonder if there should be a (perhaps user configurable)
limit to the amount of SPI plans a single function call should be able
to acquire on the basis you are going to smack into very poor
behaviors in the memory subsystem.
Stepping back, I can't help but wonder what the value of all the plan
caching going on is at all for statement blocks. Loops might comprise
a notable exception, noted. I'd humbly submit though that (relative
to functions) it's much more likely to want to do something like
insert a lot of statements and a impossible to utilize any cached
plans.
This is not an academic gripe -- I just exploded production :-D.
merlin
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2016-07-12 19:42:25 | Re: BUG #14245: Segfault on weird to_tsquery |
Previous Message | Tom Lane | 2016-07-12 19:24:01 | Re: GiST index build versus NaN coordinates |