From: | Shane Ambler <pgsql(at)Sheeky(dot)Biz> |
---|---|
To: | PostgreSQL General <pgsql-general(at)postgresql(dot)org> |
Subject: | out of memory with INSERT INTO... SELECT... |
Date: | 2008-01-15 18:53:07 |
Message-ID: | 478D0113.90007@Sheeky.Biz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Not real sure if this is an issue but from what I figure there is too
much RAM being chewed up from this.
The end result is an out of memory error
(I haven't delved deeper as yet)
So I am replicating what someone else is failing to get working in
sqlite to see what pg can do.
The end scenario is the generation of unique codes of 7 alphanumeric
characters in length. Not sure how many he really needs but is trying
for 30 million. (initial testing on the assumption that select distinct
is faster then insert into unique index column)
So anyway I created the table
CREATE TABLE codes (pincode text);
and then came up with an insert query instead of his client looping
through etc..
INSERT INTO codes
SELECT pincode FROM
(
SELECT generate_series(1,1000000) AS idx
, substring('abcdefghijklmnopqrstuvwxyz0123456789' from
cast((random()*36)as integer) for 1)
||substring('abcdefghijklmnopqrstuvwxyz0123456789' from
cast((random()*36)as integer) for 1)
||substring('abcdefghijklmnopqrstuvwxyz0123456789' from
cast((random()*36)as integer) for 1)
||substring('abcdefghijklmnopqrstuvwxyz0123456789' from
cast((random()*36)as integer) for 1)
||substring('abcdefghijklmnopqrstuvwxyz0123456789' from
cast((random()*36)as integer) for 1)
||substring('abcdefghijklmnopqrstuvwxyz0123456789' from
cast((random()*36)as integer) for 1)
||substring('abcdefghijklmnopqrstuvwxyz0123456789' from
cast((random()*36)as integer) for 1)
AS pincode
) AS pcodetbl;
generate_series gives 1M rows
Now that finishes fine - uses about 700MB of RAM (VSZ) but works.
If I change the generate_series to 10M rows it gets an out of memory
error at about 3.5GB (VSZ) and a bit under 300MB(RSS)
(this is from the client connection process not the writer etc)
from ps aux just before ending -
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
pgsql 14519 32.5 -13.8 3366412 290064 ?? Rs 3:39AM 2:22.20
postgres: pgsql postgres ::1(64645) INSERT
OK so it hits the 32 bit limit and it quits gracefully and doesn't cause
any real problem but I seem to think that the memory being allocated is
somewhat over-sized, unless I'm just missing something?
I figure that the subselect is built up in RAM then fed into the INSERT.
From what I add up, with 7 characters per row plus 4 for the sequence
and a few extra overheads, I would think less than 20 bytes per row * 1M
rows makes about 20MB, with 10M rows it goes to 200MB
That falls a long way short of what is being allocated to pg.
For ref :-
postgres=# select version();
version
--------------------------------------------------------------------
PostgreSQL 8.2.5 on powerpc-apple-darwin8.10.0, compiled by GCC
powerpc-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc. build
5367)
(1 row)
Mac OSX 10.4.11
G4 dual 1.25G
2GB RAM
--
Shane Ambler
pgSQL (at) Sheeky (dot) Biz
Get Sheeky @ http://Sheeky.Biz
From | Date | Subject | |
---|---|---|---|
Next Message | Gregory Stark | 2008-01-15 19:01:32 | Re: Index trouble with 8.3b4 |
Previous Message | Kico Zaninetti | 2008-01-15 18:53:03 | LIKE and REGEX optimization |