Re: PostgreSQL mirroring from RPM install to RPM install-revisited

From: Richard Brosnahan <broz(at)mac(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: PostgreSQL mirroring from RPM install to RPM install-revisited
Date: 2017-02-18 02:50:05
Message-ID: 856A7E1A-1F59-4237-A2E3-9E49242BFC68@mac.com
Lists: pgsql-general

Hi again Adrian,

Facepalm...

The master server was not installed by me. I was assured by the installer guy that it was version 9.4.1 and 64-bit.

Facepalm... I managed to get enough access to that server to discover they had installed the 32-bit version of PostgreSQL. Who knows why? This explains all of my issues with the 64-bit PostgreSQL on the slave. It's difficult to get access to our servers, so try not to blame me and think "Why didn't he do that first?" Still, I should have tried harder to get access.
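
For anyone who ends up in the same spot, here is a minimal sketch of one way to check whether an installed binary is 32-bit or 64-bit without trusting what you were told. It reads the ELF header of the file and looks at the class byte (1 means 32-bit, 2 means 64-bit). The function names are my own, and it assumes a Linux ELF binary; it is not something from the PostgreSQL tooling.

```python
ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def elf_class_from_header(header):
    """Return 32 or 64 given at least the first 5 bytes of an ELF file."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    ei_class = header[4]  # EI_CLASS byte: 1 = ELFCLASS32, 2 = ELFCLASS64
    if ei_class == 1:
        return 32
    if ei_class == 2:
        return 64
    raise ValueError("unknown ELF class byte: %d" % ei_class)


def elf_class(path):
    """Open a binary on disk and report whether it is 32-bit or 64-bit."""
    with open(path, "rb") as f:
        return elf_class_from_header(f.read(5))
```

Running elf_class("/usr/pgsql-9.4/bin/pg_controldata") on both the master and the slave and comparing the two results would have caught the mismatch up front.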

The PostgreSQL documentation clearly states that the two servers must have the same architecture (both 32-bit or both 64-bit). Further, when I searched Google for the errors I was seeing, I found a number of people with similar issues who were also fighting 32-bit vs. 64-bit PostgreSQL installs.
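
To illustrate why mixing architectures can make a binary file unreadable, here is a rough sketch. The record below is hypothetical and simplified; it is NOT the real pg_control layout, and I am only assuming that a difference of this general kind is behind the "different layout than this program is expecting" warning. It writes a record with a 64-bit field placed directly after three 32-bit fields (4-byte alignment), then reads it back assuming the 64-bit field sits on an 8-byte boundary.

```python
import struct

# Hypothetical record: one 64-bit id, three 32-bit fields, one 64-bit timestamp.
# "<" means little-endian with no automatic padding, so padding is explicit.
LAYOUT_4_BYTE_ALIGNED = "<QIIIq"    # timestamp starts at offset 20
LAYOUT_8_BYTE_ALIGNED = "<QIII4xq"  # 4 pad bytes, timestamp starts at offset 24


def write_record(fmt, ident, version, catalog, state, timestamp):
    """Serialize the made-up record using the given layout."""
    return struct.pack(fmt, ident, version, catalog, state, timestamp)


def read_record(fmt, raw):
    """Deserialize using the given layout, zero-filling if raw is too short."""
    size = struct.calcsize(fmt)
    return struct.unpack(fmt, raw[:size].ljust(size, b"\0"))


# Arbitrary illustrative values, not taken from any real control file.
raw = write_record(LAYOUT_4_BYTE_ALIGNED, 1, 2, 3, 4, 1487000000)

same_layout = read_record(LAYOUT_4_BYTE_ALIGNED, raw)
other_layout = read_record(LAYOUT_8_BYTE_ALIGNED, raw)

print("written with / read with same layout:", same_layout)
print("read with the other layout:          ", other_layout)
```

The first fields come back intact either way, but the timestamp read with the other layout is not the value that was written. Any checksum computed over the expected size would also cover different bytes.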

I wasted a LOT of time trying to track this down. I'm sorry I wasted other people's time too.

Anyhow, I uninstalled PostgreSQL on the slave, and reinstalled the 32 bit version. Then I followed the instructions for setting up the slave, and it all works.

Plenty to do, including setting up proper monitoring, and documentation. It's great we have a hot standby, but if nobody knows how to use it in case the master goes away, it's not so great.
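
As a starting point for the monitoring piece, here is a minimal sketch of computing how far the standby is behind, in bytes. It assumes you have already fetched the WAL positions as text, for example from pg_current_xlog_location() on the master and pg_last_xlog_replay_location() on the slave (the 9.4 function names). The helper names and the sample positions in the usage note are made up for illustration.

```python
def lsn_to_int(lsn):
    """Convert a WAL position such as '0/3000060' to a byte offset.

    The text form is two hexadecimal numbers separated by a slash:
    the high 32 bits, then the low 32 bits.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replay_lag_bytes(master_lsn, standby_lsn):
    """Bytes of WAL the standby has not yet replayed (0 when caught up)."""
    return lsn_to_int(master_lsn) - lsn_to_int(standby_lsn)
```

For example, replay_lag_bytes("0/3000060", "0/3000000") gives the distance between those two made-up positions. On the server side, pg_xlog_location_diff() does the same subtraction in SQL.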

THANK YOU for your assistance!

> On Feb 17, 2017, at 10:43 AM, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> wrote:
>
> On 02/16/2017 04:39 PM, Richard Brosnahan wrote:
>> Hi all,
>>
>> Way back in December I posted a question about mirroring from an RPM
>> installed PostgreSQL (binary) to a source built PostgreSQL, with the
>> same version (9.4.1 --> 9.4.1). Both servers are running OEL6.
>
> I went back to the previous threads and I could not find if you ever said whether the two systems are using the same hardware architecture or not? Vincent Veyron asked but I can't find a response.
>
>>
>> I won't copy the entire thread from before, as the situation has changed
>> a bit. The biggest changes are that I have root on the slave,
>> temporarily, and I've installed PostgreSQL on the slave using yum (also
>> binary).
>>
>> I've followed all the instructions found here:
>>
>> https://www.postgresql.org/docs/9.4/static/warm-standby.html#STREAMING-REPLICATION
>>
>>
>> The slave is running PostgreSQL 9.4.11 and was installed using yum.
>> It runs fine after I've run initdb and set things up. The master was
>> also installed from rpm binaries, but the installers used Puppet. That
>> version is 9.4.1. Yes, I know I should be using the exact same version,
>> but I couldn't find 9.4.1 in the PostgreSQL yum repo.
>>
>>
>> When I replace its data directory as part of the mirroring instructions,
>> using pg_basebackup, PostgreSQL won't start.
>>
>>
>> I get a checksum error, from pg_ctl.
>>
>> 2016-12-15 08:27:14.520 PST >FATAL: incorrect checksum in control file
>>
>>
>> Previously, Tom Lane suggested I try this:
>>
>> You could try using pg_controldata to compare the pg_control contents;
>> it should be willing to print field values even if it thinks the checksum
>> is bad. It would be interesting to see (a) what the master's
>> pg_controldata prints about its pg_control, (b) what the slave's
>> pg_controldata prints about pg_control from a fresh initdb there, and
>> (c) what the slave's pg_controldata prints about the copied pg_control.
>>
>>
>> For Tom's requests (a and b), I can provide good output from
>> pg_controldata from the master with production data, and from the slave
>> right after initdb. I'll provide that on request.
>>
>>
>> for Tom's request (c) I get this from the slave, after data is copied.
>>
>> $ pg_controldata
>> WARNING: Calculated CRC checksum does not match value stored in file.
>> Either the file is corrupt, or it has a different layout than this program
>> is expecting. The results below are untrustworthy.
>>
>> Segmentation fault (core dumped)
>>
>>
>> With this new installation on the slave, same result. core dump
>>
>>
>> Tom Lane then suggested:
>>
>> $ gdb path/to/pg_controldata
>> gdb> run /apps/database/postgresql-data
>> (wait for it to report segfault)
>> gdb> bt
>>
>>
>> Since I now have gdb, I can do that:
>>
>> $ gdb /usr/pgsql-9.4/bin/pg_controldata
>> -bash: gdb: command not found
>> -bash-4.1$ gdb /usr/pgsql-9.4/bin/pg_controldata
>> GNU gdb (GDB) Red Hat Enterprise Linux (7.2-90.el6)
>> Copyright (C) 2010 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
>> and "show warranty" for details.
>> This GDB was configured as "x86_64-redhat-linux-gnu".
>> For bug reporting instructions, please see:
>> <http://www.gnu.org/software/gdb/bugs/>...
>> Reading symbols from /usr/pgsql-9.4/bin/pg_controldata...(no debugging symbols found)...done.
>> Missing separate debuginfos, use: debuginfo-install postgresql94-server-9.4.11-1PGDG.rhel6.x86_64
>> (gdb) run /var/lib/pgsql/9.4/data
>> Starting program: /usr/pgsql-9.4/bin/pg_controldata /var/lib/pgsql/9.4/data
>> WARNING: Calculated CRC checksum does not match value stored in file.
>> Either the file is corrupt, or it has a different layout than this program
>> is expecting. The results below are untrustworthy.
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x00000033d20a3a15 in __strftime_internal () from /lib64/libc.so.6
>> (gdb) bt
>> #0 0x00000033d20a3a15 in __strftime_internal () from /lib64/libc.so.6
>> #1 0x00000033d20a5a36 in strftime_l () from /lib64/libc.so.6
>> #2 0x00000000004015c7 in ?? ()
>> #3 0x00000033d201ed1d in __libc_start_main () from /lib64/libc.so.6
>> #4 0x0000000000401349 in ?? ()
>> #5 0x00007fffffffe518 in ?? ()
>> #6 0x000000000000001c in ?? ()
>> #7 0x0000000000000002 in ?? ()
>> #8 0x00007fffffffe751 in ?? ()
>> #9 0x00007fffffffe773 in ?? ()
>> #10 0x0000000000000000 in ?? ()
>> (gdb)
>>
>>
>> pg_controldata shouldn't be core dumping.
>>
>>
>> Should I give up trying to use 9.4.1 and 9.4.11 as master/slave?
>>
>> My options appear to be
>>
>> 1 upgrade the master to 9.4.11, which will be VERY DIFFICULT given its
>> Puppet install, and the difficulty I have getting root access to our
>> servers.
>>
>> 2 Downgrade the slave. This is easier than option 1, but I would need to
>> find a yum repo that has that version.
>>
>> 3 Make what I have work, somehow.
>>
>> Any assistance would be greatly appreciated!
>>
>> --
>>
>> Richard Brosnahan
>>
>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com <mailto:adrian(dot)klaver(at)aklaver(dot)com>
