Re: psql: FATAL: the database system is starting up

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Tom K <tomkcpr(at)gmail(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: psql: FATAL: the database system is starting up
Date: 2019-05-29 14:28:24
Message-ID: 76851993-4826-f300-85ac-42d8b705a56f@aklaver.com
Lists: pgsql-general

On 5/28/19 6:59 PM, Tom K wrote:
> On Tue, May 28, 2019 at 9:53 AM Adrian Klaver
> <adrian(dot)klaver(at)aklaver(dot)com> wrote:
>
> Correct.  Master election occurs through Patroni.  WAL level is set to:
>
> wal_level = 'replica'
>
> So no archiving.
>
> > > After the most recent crash 2-3 weeks ago, the cluster is now
> > > running into this message but I'm not able to make heads or tails
> > > out of why it's throwing this:
> >
> > So you have not been able to run the cluster the past 2-3 weeks or is
> > that more recent?
>
> Haven't been able to bring this PostgreSQL cluster up (run the cluster)
> since 2-3 weeks ago.  Tried quite a few combinations of options to
> recover this.  No luck.  Had storage failures earlier, even with
> corrupted OS files, but this PostgreSQL cluster w/ Patroni was able to
> come up each time without any recovery effort on my part.
>
> > When you refer to history files below are you talking about WAL
> > files or something else?
> >
> > Is this:
> >
> > "recovery command file "recovery.conf" specified neither
> > primary_conninfo nor restore_command"
> >
> > true?
>
> True. recovery.conf is controlled by Patroni.  Contents of this file
> remained the same for all the cluster nodes with the exception of the
> primary_slot_name:
>
> [root(at)psql01 postgresql-patroni-etcd]# cat recovery.conf
> primary_slot_name = 'postgresql0'
> standby_mode = 'on'
> recovery_target_timeline = 'latest'
> [root(at)psql01 postgresql-patroni-etcd]#
>
> [root(at)psql02 postgres-backup]# cat recovery.conf
> primary_slot_name = 'postgresql1'
> standby_mode = 'on'
> recovery_target_timeline = 'latest'
> [root(at)psql02 postgres-backup]#
>
> [root(at)psql03 postgresql-patroni-backup]# cat recovery.conf
> primary_slot_name = 'postgresql2'
> standby_mode = 'on'
> recovery_target_timeline = 'latest'
> [root(at)psql03 postgresql-patroni-backup]#
>
> I've made a copy of the root postgres directory over to another location
> so when troubleshooting, I can always revert to the first state the
> cluster was in when it failed.

I have no experience with Patroni so I will be of no help there. You
might get more useful information from the Community section of:

https://github.com/zalando/patroni

"There are two places to connect with the Patroni community: on github,
via Issues and PRs, and on channel #patroni in the PostgreSQL Slack. If
you're using Patroni, or just interested, please join us."

That being said, can you start the copied Postgres instance without
using the Patroni instrumentation?
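
As a rough sketch of what I mean, assuming pg_ctl from the matching
PostgreSQL version is on the PATH: the path, port, log file name and the
DO_START switch below are placeholders of mine, not anything taken from
your setup. By default it only prints the pg_ctl command so it can be
checked before anything is started:

```shell
#!/bin/sh
# Placeholders (assumptions, not taken from the thread): adjust before use.
PGDATA_COPY="${PGDATA_COPY:-/path/to/copied/data}"   # the copied data directory
PGPORT_TEST="${PGPORT_TEST:-5433}"                   # spare port, away from the Patroni-managed one

# Print the pg_ctl command for the copy; run it only when DO_START=1.
start_copy() {
    log="$PGDATA_COPY/manual-start.log"
    if [ "${DO_START:-0}" = "1" ]; then
        # -D: data directory, -o: options passed to postgres, -l: server log file
        pg_ctl -D "$PGDATA_COPY" -o "-p $PGPORT_TEST" -l "$log" start
    else
        printf 'pg_ctl -D %s -o "-p %s" -l %s start\n' "$PGDATA_COPY" "$PGPORT_TEST" "$log"
    fi
}

start_copy
```

Run as the postgres OS user with DO_START=1 once the printed command
looks right. If the copy still contains the recovery.conf shown above,
the server will come up in standby mode; either way, whatever lands in
the log file would be worth posting here.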

>
> Thx,
> TK
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
