Re: pg_restore out of memory

From: Sameer Kumar <sameer(dot)kumar(at)ashnik(dot)com>
To: Miguel Ramos <org(dot)postgresql(at)miguel(dot)ramos(dot)name>, pgsql-general(at)postgresql(dot)org
Subject: Re: pg_restore out of memory
Date: 2016-07-12 13:08:02
Message-ID: CADp-Sm5pkte71C6nWSte5HNidoQNhvvpgZKZ=9y=cks6+2dqoQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, 12 Jul 2016, 7:25 p.m. Miguel Ramos, <
org(dot)postgresql(at)miguel(dot)ramos(dot)name> wrote:

>
> Hi,
>
> We have backed up a database and now when trying to restore it to the
> same server we get this:
>
> > # pg_restore -d recovery /mnt/paysdeloire2013_convertida2.1.dump
> > pg_restore: [custom archiver] out of memory
> > 12:09:56.58 9446.593u+1218.508s 24.3% 167+2589k 6+0io 0pf+0sw
> 6968822cs
>
>
> Some information about the application:
>
> - We have sensor data, including pictures, and number crunshing output,
> then so the large tables on this database have 319, 279, 111 and 26GB.
> Mostly on TOAST pages, but the 279GB one divides it evenly. This
> database is 765GB. We try to keep them under 4TB.
> - Transactions are large, some 100 MB at a time.
> - We also use PostGIS.
>
> About the server (dedicated):
>
> - FreeBSD 9.1-RELEASE #0 on AMD64
> - 16 GB of RAM
> - 8x3GB hardware RAID 10
> - 10TB slice for pgdata UFS-formatted and 32kB block
> - PostgreSQL 9.1.8 custom compiled to get 32kB blocks
> - Installed in 2013 with ~10 people working with it, 145 days uptime today.
>
> I found two relevant threads on the mailing-lists.
> The most recent one sugested that postgresql was being configured to use
> more memory than what's available.
> The older one sugested that the system limits on the size of the data or
> stack segments were lower than required.
>
> So here are some server parameters (relevant or otherwise):
>
> > max_connections = 100
> > shared_buffers = 4GB -- 25% of RAM
> > temp_buffers = 32MB -- irrelevant?
> > work_mem = 64MB
> > maintenance_work_mem = was 1G lowered to 256M then 64M
>

Why did you lower it? I think increasing it should help better. But 1GB
seems like fine.

> wal_buffers = -1 -- should mean 1/32 of shared_buffers = 128MB
>

Increase this during the restore, may be 512MB

> checkpoint_segments = 64 -- WAL segments are 16MB
> > effective_cache_size = 8GB -- irrelevant?
>
>
> I suspect that the restore fails when constructing the indices. After
> the process is aborted, the data appears to be all or most there, but no
> indices.
>

What is logged in database log files? Have you checked that?

So, all I did so far, was lowering maintenance_work_mem and it didn't work.
>
> System limits, as you can see, are at defaults (32GB for data and 512MB
> for stack):
>
> > # limit
> > cputime unlimited
> > filesize unlimited
> > datasize 33554432 kbytes
> > stacksize 524288 kbytes
> > coredumpsize unlimited
> > memoryuse unlimited
> > vmemoryuse unlimited
> > descriptors 11095
> > memorylocked unlimited
> > maxproc 5547
> > sbsize unlimited
> > swapsize unlimited
>
> Shared memory is configured to allow for the single shared memory
> segment postgresql appears to use, plus a bit of extra (8GB):
>
> > # ipcs -M
> > shminfo:
> > shmmax: 8589934592 (max shared memory segment size)
> > shmmin: 1 (min shared memory segment size)
> > shmmni: 192 (max number of shared memory
> identifiers)
> > shmseg: 128 (max shared memory segments per process)
> > shmall: 2097152 (max amount of shared memory in pages)
>
> And semaphores (irrelevant?)...
>
> > # ipcs -S
> > seminfo:
> > semmni: 256 (# of semaphore identifiers)
> > semmns: 512 (# of semaphores in system)
> > semmnu: 256 (# of undo structures in system)
> > semmsl: 340 (max # of semaphores per id)
> > semopm: 100 (max # of operations per semop call)
> > semume: 50 (max # of undo entries per process)
> > semusz: 632 (size in bytes of undo structure)
> > semvmx: 32767 (semaphore maximum value)
> > semaem: 16384 (adjust on exit max value)
>

What are your vm.dirty_ratio and vm.dirty_background_ratio? I think
reducing them may help. But can not really say what exactly would help
unless you are able to get the error source in db logs

>
>
> I don't know what else to try.
> I lowered maintenance_work_mem without restarting the server.
> In some of the attempts, but not all, the restore was done while people
> were working.
>
> Each attempt takes 12 hours...
> We couldn't use the directory -Fd dump because it's postgresql 9.1.
> The original database is still on the server, this is a test restore.
>
> We have about one or two months of slack before we really need to remove
> them from the server to recover space.
>
>
> --
> Miguel Ramos
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
--
--
Best Regards
Sameer Kumar | DB Solution Architect
*ASHNIK PTE. LTD.*

101 Cecil Street, #11-11 Tong Eng Building, Singapore 069 533

T: +65 6438 3504 | M: +65 8110 0350 | www.ashnik.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2016-07-12 14:05:32 Re: Slow SQL?
Previous Message Albe Laurenz 2016-07-12 12:40:16 Re: Slow SQL?