PANIC: could not write to file "pg_wal/xlogtemp.11399": No space left on device

From: Jason Ralph <jralph(at)affinitysolutions(dot)com>
To: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: PANIC: could not write to file "pg_wal/xlogtemp.11399": No space left on device
Date: 2019-09-13 15:10:50
Message-ID: BL0PR04MB6499BC32502F223A7BDF9BF2D0B30@BL0PR04MB6499.namprd04.prod.outlook.com
Lists: pgsql-general

Hello list,
[10:47:13] [postgres(at)host] $ psql --version
psql (PostgreSQL) 11.5
[11:06:36] [postgres(at)host] $ cat /etc/redhat-release
CentOS release 6.10 (Final)
[11:06:33] [postgres(at)host] $ uname -a
Linux host 2.6.32-754.18.2.el6.x86_64 #1 SMP Wed Aug 14 16:26:59 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

I have been upgrading my pg9.3 systems to pg11.5, and the upgrade itself went well using the pg_upgrade method. Once the upgrade was complete I ran the following:
/usr/pgsql-11/bin/vacuumdb -v -j 6 --all --analyze

I have pg_wal on a separate partition, and it ran out of space. When I saw this, I issued what I think was a premature stop (service postgresql-11 stop), then moved pg_wal to a larger partition and started postgres. The database did recover after the recovery process had run for some time, which is great!
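For illustration only, here is a minimal sketch of a pre-flight check that could run before a WAL-heavy job such as vacuumdb. It assumes Python 3 is available on the host. The function names and the 25% threshold are hypothetical and are not taken from my setup.

```python
import shutil


def free_fraction(path):
    """Return the fraction of the filesystem holding `path` that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def enough_space(path, min_free_fraction=0.25):
    """Return True when at least `min_free_fraction` of the filesystem is free.

    The 0.25 default is an arbitrary illustrative threshold, not a
    PostgreSQL recommendation.
    """
    return free_fraction(path) >= min_free_fraction
```

A wrapper script could call enough_space() with the pg_wal directory as the path and refuse to start vacuumdb when it returns False.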

My question is: should I be concerned about data loss, given that a vacuumdb with analyze crashed for lack of space and I then restarted the database, possibly while it was trying to recover? The HINT message at 2019-09-12 23:26:15.978 in the log scares me a bit.

2019-09-12 23:15:23.519 EDT [10101] ERROR: canceling autovacuum task
2019-09-12 23:15:23.519 EDT [10101] CONTEXT: automatic vacuum of table "famnet5.public.private_tablename"
2019-09-12 23:24:51.640 EDT [11399] PANIC: could not write to file "pg_wal/xlogtemp.11399": No space left on device
2019-09-12 23:24:51.640 EDT [11399] CONTEXT: writing block 288 of relation base/16402/4190122
2019-09-12 23:24:51.640 EDT [11399] STATEMENT: VACUUM (VERBOSE, ANALYZE) public.private_table_name;
2019-09-12 23:24:51.686 EDT [9792] LOG: server process (PID 11399) was terminated by signal 6: Aborted
2019-09-12 23:24:51.686 EDT [9792] DETAIL: Failed process was running: VACUUM (VERBOSE, ANALYZE) public.private_tablename;
2019-09-12 23:24:51.687 EDT [9792] LOG: terminating any other active server processes
2019-09-12 23:24:51.689 EDT [11396] WARNING: terminating connection because of crash of another server process
2019-09-12 23:24:51.689 EDT [11396] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2019-09-12 23:24:51.689 EDT [11396] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2019-09-12 23:24:51.689 EDT [9799] WARNING: terminating connection because of crash of another server process
2019-09-12 23:24:51.689 EDT [9799] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2019-09-12 23:24:51.689 EDT [9799] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2019-09-12 23:24:51.689 EDT [11398] WARNING: terminating connection because of crash of another server process
2019-09-12 23:24:51.689 EDT [11398] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2019-09-12 23:24:51.689 EDT [11398] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2019-09-12 23:24:51.689 EDT [11397] WARNING: terminating connection because of crash of another server process
2019-09-12 23:24:51.689 EDT [11397] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2019-09-12 23:24:51.689 EDT [11397] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2019-09-12 23:24:51.691 EDT [12616] LOG: PID 11398 in cancel request did not match any process
2019-09-12 23:24:51.692 EDT [9792] LOG: all server processes terminated; reinitializing
2019-09-12 23:24:51.758 EDT [12618] LOG: PID 11399 in cancel request did not match any process
2019-09-12 23:24:51.758 EDT [12617] LOG: database system was interrupted; last known up at 2019-09-12 23:21:01 EDT
2019-09-12 23:24:54.110 EDT [12617] LOG: database system was not properly shut down; automatic recovery in progress
2019-09-12 23:24:54.121 EDT [12617] LOG: redo starts at 119/3A8EAF50
2019-09-12 23:25:34.712 EDT [9792] LOG: received fast shutdown request
2019-09-12 23:25:34.735 EDT [9792] LOG: abnormal database system shutdown
2019-09-12 23:25:34.779 EDT [9792] LOG: database system is shut down
2019-09-12 23:26:15.978 EDT [12972] LOG: database system was interrupted while in recovery at 2019-09-12 23:24:54 EDT
2019-09-12 23:26:15.978 EDT [12972] HINT: This probably means that some data is corrupted and you will have to use the last backup for recovery.
2019-09-12 23:26:19.213 EDT [12972] LOG: database system was not properly shut down; automatic recovery in progress
2019-09-12 23:26:19.218 EDT [12972] LOG: redo starts at 119/3A8EAF50
2019-09-12 23:32:06.654 EDT [12972] LOG: redo done at 119/73FFFC30
2019-09-12 23:32:06.665 EDT [12972] LOG: last completed transaction was at log time 2019-09-12 23:24:49.792961-04
2019-09-12 23:32:10.087 EDT [12969] LOG: database system is ready to accept connections
2019-09-12 23:53:12.221 EDT [12969] LOG: received fast shutdown request
2019-09-12 23:53:12.226 EDT [12969] LOG: aborting any active transactions
2019-09-12 23:53:12.228 EDT [12969] LOG: background worker "logical replication launcher" (PID 14020) exited with exit code 1
2019-09-12 23:53:12.309 EDT [14015] LOG: shutting down
2019-09-12 23:53:12.407 EDT [12969] LOG: database system is shut down
2019-09-12 23:53:19.029 EDT [16953] LOG: database system was shut down at 2019-09-12 23:53:12 EDT
2019-09-12 23:53:19.041 [16950] LOG: database system is ready to accept connections
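
To put the two "redo starts at" / "redo done at" positions in the log above in perspective, here is a small sketch that converts them to byte offsets and takes the difference. It assumes the usual LSN notation of two hexadecimal numbers separated by a slash, the high and low 32 bits of a 64-bit WAL position. The helper names are hypothetical.

```python
def lsn_to_int(lsn):
    """Convert an LSN such as '119/3A8EAF50' to an absolute WAL byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_bytes_between(start_lsn, end_lsn):
    """Return the number of WAL bytes between two LSNs."""
    return lsn_to_int(end_lsn) - lsn_to_int(start_lsn)


# The two positions reported by the second, completed recovery run in the log.
replayed = wal_bytes_between("119/3A8EAF50", "119/73FFFC30")
print(f"{replayed / 1024**2:.0f} MiB of WAL replayed")
```

On a running server the same difference can be obtained with the built-in pg_wal_lsn_diff() function.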

Jason Ralph

