Re: psql: FATAL: the database system is starting up

From: Tom K <tomkcpr(at)gmail(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: psql: FATAL: the database system is starting up
Date: 2019-06-01 19:32:55
Message-ID: CAE3EmBD+LN3Mx2W2MbqANt=oaZ6dBDDn-o3a=01naM6at8PZ9w@mail.gmail.com
Lists: pgsql-general

On Sat, Jun 1, 2019 at 9:55 AM Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
wrote:

> On 5/31/19 7:53 PM, Tom K wrote:
> >
>
> > There are two places to connect with the Patroni community: on
> > github, via Issues and PRs, and on channel #patroni in the
> > PostgreSQL Slack. If you're using Patroni, or just interested,
> > please join us.
> >
> >
> > Will post there as well. Thank you. My thinking was to post here first
> > since I suspect the Patroni community will simply refer me back here
> > given that the PostgreSQL errors are originating directly from
> > PostgreSQL.
> >
> >
> > That being said, can you start the copied Postgres instance without
> > using the Patroni instrumentation?
> >
> >
> > Yes, that is something I have been trying to do actually. But I hit a
> > dead end with the three errors above.
> >
> > So what I did was copy a single node's backed-up data files to
> > */data/patroni* on the same node (this is the PostgreSQL data
> > directory as defined through Patroni), then ran this
> > (psql03 = 192.168.0.118):
> >
> > # sudo su - postgres
> > $ /usr/pgsql-10/bin/postgres -D /data/patroni
> > --config-file=/data/patroni/postgresql.conf
> > --listen_addresses=192.168.0.118 --max_worker_processes=8
> > --max_locks_per_transaction=64 --wal_level=replica
> > --track_commit_timestamp=off --max_prepared_transactions=0 --port=5432
> > --max_replication_slots=10 --max_connections=100 --hot_standby=on
> > --cluster_name=postgres --wal_log_hints=on --max_wal_senders=10 -d 5
>
> Why all the options?
> That should be covered in postgresql.conf, no?
>
> >
> > This resulted in one of the 3 messages above. Hence the post here. If
> > I can start a single instance, I should be fine since I could then 1)
> > replicate over to the other two or 2) simply take a dump, reinitialize
> > all the databases then restore the dump.
> >
>
> What if you move the recovery.conf file out?

Will try.
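
For reference, a minimal sketch of what that attempt could look like. It assumes the data directory is /data/patroni as in the command quoted above, that the server is stopped, and that it is run as the postgres user. The helper name and the ".moved" suffix are hypothetical and chosen only for illustration. On PostgreSQL 10 a recovery.conf in the data directory puts the server into archive recovery or standby mode at startup, so moving it aside should let the instance attempt an ordinary start instead.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): move recovery.conf out of
# the way so the server does not enter recovery/standby mode on start.
# The file is renamed rather than deleted so it can be put back later.
set_aside_recovery_conf() {
    datadir=$1
    if [ -f "$datadir/recovery.conf" ]; then
        mv "$datadir/recovery.conf" "$datadir/recovery.conf.moved"
        echo "moved"
    else
        echo "absent"
    fi
}

# Intended use on the node (assumed paths, taken from the quoted command):
#   set_aside_recovery_conf /data/patroni
#   /usr/pgsql-10/bin/postgres -D /data/patroni
```

Starting with only -D also covers the earlier question about the options: postgres reads postgresql.conf from the data directory by default, so the extra command-line settings should only be needed if they differ from what that file already contains.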

>
> The below looks like missing/corrupted/incorrect files. Hard to tell
> without knowing what Patroni did?

Storage disappeared from underneath these clusters. The OS was of course
still in memory making futile attempts to write to disk, which would never
complete.

My best guess is that Patroni or PostgreSQL was in the middle of some writes
across the clusters when the failure occurred.

>
> > Using the above procedure I get one of three error messages when using
> > the data files of each node:
> >
> > [ PSQL01 ]
> > postgres: postgres: startup process waiting for 000000010000000000000008
> >
> > [ PSQL02 ]
> > PANIC:  replication checkpoint has wrong magic 0 instead of 307747550
> >
> > [ PSQL03 ]
> > FATAL:  syntax error in history file: f2W
> >
> > And I can't start any one of them.
> >
> >
> >
> > >
> > > Thx,
> > > TK
> > >
> >
> >
> >
> > --
> > Adrian Klaver
> > adrian(dot)klaver(at)aklaver(dot)com <mailto:adrian(dot)klaver(at)aklaver(dot)com>
> >
>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com
>
