Re: psql: FATAL: the database system is starting up

From: Tom K <tomkcpr(at)gmail(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: psql: FATAL: the database system is starting up
Date: 2019-06-02 03:07:52
Message-ID: CAE3EmBB2Vut6Tyibj=kdrfDk1jvPXpVvD0gHJ1ysg9A6OD=6PA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sat, Jun 1, 2019 at 8:53 PM Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
wrote:

> On 6/1/19 5:32 PM, Tom K wrote:
>
> >
> >
> > Trying what we did above but on the second node:
> >
>
> Was this node the primary?
>
> To me the below looks like there are replication slots set up that are
> failing. Not sure how to deal with this at the moment. You might try
> single-user mode:
>
> https://www.postgresql.org/docs/10/app-postgres.html
> Single-User Mode
> ...
>
> and see if that at least gets the server started. This is a highly
> restricted so do not expect much usability.
>

These servers did crash before however didn't' notice anything particularly
wrong at the time. Perhaps a better check was in order.

Here is the result from attempting to start the DB in single user mode.

-bash-4.2$ /usr/pgsql-10/bin/postgres --single -D /data/patroni
--config-file=/data/patroni/postgresql.conf --hot_standby=off
--listen_addresses=192.168.0.124 --max_worker_processes=8
--max_locks_per_transaction=64 --wal_level=replica --cluster_name=postgres
--wal_log_hints=on --max_wal_senders=10 --track_commit_timestamp=off
--max_prepared_transactions=0 --port=5432 --max_replication_slots=10
--max_connections=20 -d 5 2>&1
2019-06-01 23:04:20.876 EDT [31831] DEBUG: invoking
IpcMemoryCreate(size=144687104)
2019-06-01 23:04:20.877 EDT [31831] DEBUG: mmap(144703488) with
MAP_HUGETLB failed, huge pages disabled: Cannot allocate memory
2019-06-01 23:04:20.890 EDT [31831] DEBUG: SlruScanDirectory invoking
callback on pg_notify/0000
2019-06-01 23:04:20.890 EDT [31831] DEBUG: removing file "pg_notify/0000"
2019-06-01 23:04:20.890 EDT [31831] DEBUG: dynamic shared memory system
will support 128 segments
2019-06-01 23:04:20.891 EDT [31831] DEBUG: created dynamic shared memory
control segment 846930886 (3088 bytes)
2019-06-01 23:04:20.891 EDT [31831] DEBUG: InitPostgres
2019-06-01 23:04:20.891 EDT [31831] DEBUG: my backend ID is 1
2019-06-01 23:04:20.891 EDT [31831] LOG: database system was interrupted;
last known up at 2019-04-28 06:06:24 EDT
2019-06-01 23:04:21.239 EDT [31831] LOG: invalid record length at
0/4C35CDF8: wanted 24, got 0
2019-06-01 23:04:21.239 EDT [31831] LOG: invalid primary checkpoint record
2019-06-01 23:04:21.239 EDT [31831] LOG: using previous checkpoint record
at 0/4C34EDA8
2019-06-01 23:04:21.239 EDT [31831] DEBUG: redo record is at 0/4C34ED70;
shutdown FALSE
2019-06-01 23:04:21.239 EDT [31831] DEBUG: next transaction ID: 0:1409831;
next OID: 237578
2019-06-01 23:04:21.239 EDT [31831] DEBUG: next MultiXactId: 48; next
MultiXactOffset: 174
2019-06-01 23:04:21.239 EDT [31831] DEBUG: oldest unfrozen transaction ID:
549, in database 1
2019-06-01 23:04:21.239 EDT [31831] DEBUG: oldest MultiXactId: 1, in
database 1
2019-06-01 23:04:21.239 EDT [31831] DEBUG: commit timestamp Xid
oldest/newest: 0/0
2019-06-01 23:04:21.239 EDT [31831] DEBUG: transaction ID wrap limit is
2147484196, limited by database with OID 1
2019-06-01 23:04:21.239 EDT [31831] DEBUG: MultiXactId wrap limit is
2147483648, limited by database with OID 1
2019-06-01 23:04:21.239 EDT [31831] DEBUG: starting up replication slots
2019-06-01 23:04:21.239 EDT [31831] DEBUG: starting up replication origin
progress state
2019-06-01 23:04:21.239 EDT [31831] PANIC: replication checkpoint has
wrong magic 0 instead of 307747550
Aborted
-bash-4.2$
-bash-4.2$ /usr/pgsql-10/bin/postgres --single -D /data/patroni
--config-file=/data/patroni/postgresql.conf
2019-06-01 23:06:28.679 EDT [31969] LOG: database system was interrupted;
last known up at 2019-04-28 06:06:24 EDT
2019-06-01 23:06:28.953 EDT [31969] LOG: invalid record length at
0/4C35CDF8: wanted 24, got 0
2019-06-01 23:06:28.953 EDT [31969] LOG: invalid primary checkpoint record
2019-06-01 23:06:28.953 EDT [31969] LOG: using previous checkpoint record
at 0/4C34EDA8
2019-06-01 23:06:28.953 EDT [31969] PANIC: replication checkpoint has
wrong magic 0 instead of 307747550
Aborted
-bash-4.2$

>
> > Removing the recovery.conf and restarting yields the following messages:
> >
> > 2019-06-01 20:31:12.231 EDT [21910] LOG: database system was
> > interrupted; last known up at 2019-04-28 06:06:24 EDT
> > 2019-06-01 20:31:12.505 EDT [21910] LOG: invalid record length at
> > 0/4C35CDF8: wanted 24, got 0
> > 2019-06-01 20:31:12.505 EDT [21910] LOG: invalid primary checkpoint
> record
> > 2019-06-01 20:31:12.505 EDT [21910] LOG: using previous checkpoint
> > record at 0/4C34EDA8
> > 2019-06-01 20:31:12.505 EDT [21910] DEBUG: redo record is at
> > 0/4C34ED70; shutdown FALSE
> > 2019-06-01 20:31:12.505 EDT [21910] DEBUG: next transaction ID:
> > 0:1409831; next OID: 237578
> > 2019-06-01 20:31:12.505 EDT [21910] DEBUG: next MultiXactId: 48; next
> > MultiXactOffset: 174
> > 2019-06-01 20:31:12.505 EDT [21910] DEBUG: oldest unfrozen transaction
> > ID: 549, in database 1
> > 2019-06-01 20:31:12.505 EDT [21910] DEBUG: oldest MultiXactId: 1, in
> > database 1
> > 2019-06-01 20:31:12.505 EDT [21910] DEBUG: commit timestamp Xid
> > oldest/newest: 0/0
> > 2019-06-01 20:31:12.505 EDT [21910] DEBUG: transaction ID wrap limit is
> > 2147484196, limited by database with OID 1
> > 2019-06-01 20:31:12.505 EDT [21910] DEBUG: MultiXactId wrap limit is
> > 2147483648, limited by database with OID 1
> > 2019-06-01 20:31:12.505 EDT [21910] DEBUG: starting up replication slots
> > 2019-06-01 20:31:12.505 EDT [21910] DEBUG: starting up replication
> > origin progress state
> > 2019-06-01 20:31:12.505 EDT [21910] PANIC: replication checkpoint has
> > wrong magic 0 instead of 307747550
> > 2019-06-01 20:31:12.506 EDT [21908] DEBUG: reaping dead processes
> > 2019-06-01 20:31:12.506 EDT [21908] LOG: startup process (PID 21910)
> > was terminated by signal 6: Aborted
> > 2019-06-01 20:31:12.506 EDT [21908] LOG: aborting startup due to
> > startup process failure
> > 2019-06-01 20:31:12.506 EDT [21908] DEBUG: shmem_exit(1): 0
> > before_shmem_exit callbacks to make
> > 2019-06-01 20:31:12.506 EDT [21908] DEBUG: shmem_exit(1): 5
> > on_shmem_exit callbacks to make
> > 2019-06-01 20:31:12.506 EDT [21908] DEBUG: cleaning up dynamic shared
> > memory control segment with ID 976986759
> > 2019-06-01 20:31:12.508 EDT [21908] DEBUG: proc_exit(1): 2 callbacks to
> > make
> > 2019-06-01 20:31:12.508 EDT [21908] LOG: database system is shut down
> > 2019-06-01 20:31:12.508 EDT [21908] DEBUG: exit(1)
> > 2019-06-01 20:31:12.508 EDT [21908] DEBUG: shmem_exit(-1): 0
> > before_shmem_exit callbacks to make
> > 2019-06-01 20:31:12.508 EDT [21908] DEBUG: shmem_exit(-1): 0
> > on_shmem_exit callbacks to make
> > 2019-06-01 20:31:12.508 EDT [21908] DEBUG: proc_exit(-1): 0 callbacks
> > to make
> > 2019-06-01 20:31:12.510 EDT [21909] DEBUG: logger shutting down
> > 2019-06-01 20:31:12.510 EDT [21909] DEBUG: shmem_exit(0): 0
> > before_shmem_exit callbacks to make
> > 2019-06-01 20:31:12.510 EDT [21909] DEBUG: shmem_exit(0): 0
> > on_shmem_exit callbacks to make
> > 2019-06-01 20:31:12.510 EDT [21909] DEBUG: proc_exit(0): 0 callbacks to
> > make
> > 2019-06-01 20:31:12.510 EDT [21909] DEBUG: exit(0)
> > 2019-06-01 20:31:12.510 EDT [21909] DEBUG: shmem_exit(-1): 0
> > before_shmem_exit callbacks to make
> > 2019-06-01 20:31:12.510 EDT [21909] DEBUG: shmem_exit(-1): 0
> > on_shmem_exit callbacks to make
> > 2019-06-01 20:31:12.510 EDT [21909] DEBUG: proc_exit(-1): 0 callbacks
> > to make
> >
> > --
> > Adrian Klaver
> > adrian(dot)klaver(at)aklaver(dot)com <mailto:adrian(dot)klaver(at)aklaver(dot)com>
> >
>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Peter J. Holzer 2019-06-02 07:37:32 Re: Questions about btree_gin vs btree_gist for low cardinality columns
Previous Message Adrian Klaver 2019-06-02 00:53:31 Re: psql: FATAL: the database system is starting up