Re: Failing Multi-Job Restores, Missing Indexes on Restore

From: Cea Stapleton <cea(at)healthfinch(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org, eli(at)healthfinch(dot)com
Subject: Re: Failing Multi-Job Restores, Missing Indexes on Restore
Date: 2016-09-29 12:56:43
Message-ID: 8FBA7790-5E54-4046-B466-C030E4369580@healthfinch.com
Lists: pgsql-performance

Thanks Tom!

We’re using pg_restore (PostgreSQL) 9.5.4 for the restores. We’ve tried several values for the job count (-j), for example:

/usr/bin/pg_restore -j 6 -Fc -O -c -d DBNAME RESTORE_FILE
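
To make the job-count variations concrete, here is a minimal dry-run sketch that only prints the command line for a few -j values instead of executing it. The helper name build_restore_cmd and the job counts 2 and 4 are illustrative assumptions; only -j 6 and the remaining flags come from the command above.

```shell
#!/bin/sh
# Hypothetical helper: prints the pg_restore command for a given job count.
# The flags (-Fc -O -c -d) are copied from the command quoted in this message.
build_restore_cmd() {
  jobs="$1"; db="$2"; file="$3"
  printf '/usr/bin/pg_restore -j %s -Fc -O -c -d %s %s\n' "$jobs" "$db" "$file"
}

# Dry run: show the command for each job count without running a restore.
# DBNAME and RESTORE_FILE are the same placeholders used in the message.
for j in 2 4 6; do
  build_restore_cmd "$j" DBNAME RESTORE_FILE
done
```

Printing rather than running keeps the sketch safe to paste anywhere; the printed line can be executed by hand once the placeholders are filled in.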

We’ll take a look at the memory overcommit settings. Would that also explain the missing-index issues we were seeing before the crashes started?
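
For reference, a rough sketch of how the current overcommit mode could be inspected on a Linux host. It assumes /proc is mounted. The function name describe_overcommit is made up for illustration, and the value meanings follow the Linux kernel's overcommit documentation, not anything stated in this thread.

```shell
#!/bin/sh
# Hypothetical helper: maps a vm.overcommit_memory value to its meaning.
describe_overcommit() {
  case "$1" in
    0) echo "heuristic overcommit (kernel default)" ;;
    1) echo "always overcommit" ;;
    2) echo "strict accounting, no overcommit" ;;
    *) echo "unknown value" ;;
  esac
}

# Standard location on Linux; report cleanly if it is not readable.
proc_file=/proc/sys/vm/overcommit_memory
if [ -r "$proc_file" ]; then
  mode=$(cat "$proc_file")
  echo "vm.overcommit_memory=$mode: $(describe_overcommit "$mode")"
else
  echo "cannot read $proc_file (not a Linux host?)"
fi

# The PostgreSQL documentation linked below suggests disallowing overcommit
# with the following command (requires root; not executed by this sketch):
#   sysctl -w vm.overcommit_memory=2
```

The script only reads the setting; changing it is left as a commented command because it affects every process on the machine.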

Cea Stapleton
Operations Engineer
http://www.healthfinch.com

> On Sep 29, 2016, at 7:52 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Cea Stapleton <cea(at)healthfinch(dot)com> writes:
>> We are having a baffling problem we hope you might be able to help with. We were hoping to speed up postgres restores to our reporting server. First, we were seeing missing indexes with pg_restore to our reporting server for one of our databases when we did pg_restore with multiple jobs (a clean restore, we also tried dropping the database prior to restore, just in case something was extant and amiss). The indexes missed were not consistent, and we were only ever seeing errors on import that indicated an index had not yet been built. For example:
>
>> pg_restore: [archiver (db)] could not execute query: ERROR: index "index_versions_on_item_type_and_item_id" does not exist
>> Command was: DROP INDEX public.index_versions_on_item_type_and_item_id;
>
> Which PG version is that; particularly, which pg_restore version?
> What's the exact pg_restore command you were issuing?
>
>> We decided to move back to a multi-job regular restore, and then the restores began crashing thusly:
>> [2016-09-14 02:20:36 UTC] LOG: server process (PID 27624) was terminated by signal 9: Killed
>
> This is probably the dreaded Linux OOM killer. Fix by reconfiguring your
> system to disallow memory overcommit, or at least make it not apply to
> Postgres, cf
> https://www.postgresql.org/docs/9.5/static/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT
>
> regards, tom lane
