Re: pg_dump error attempting to upgrade from PostgreSQL 10 to PostgreSQL 12

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: "Burgess, Freddie" <Freddie(dot)Burgess(at)maxar(dot)com>
Cc: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_dump error attempting to upgrade from PostgreSQL 10 to PostgreSQL 12
Date: 2020-11-07 00:10:04
Message-ID: 20201107011004.50b6de0f@enterprisedb.com
Lists: pgsql-bugs

On Thu, 5 Nov 2020 21:19:17 +0000
"Burgess, Freddie" <Freddie(dot)Burgess(at)maxar(dot)com> wrote:

> Simple steps:
>
> BACKUP: pg_dump -U postgres -d <database> > sherlock.dmp <- From the pg10 instance
> RESTORE: psql -U postgres -d <database> -1 -f sherlock.dmp <- On the pg12 instance
>
> Postgres Log:
>
> free(): invalid pointer
> free(): invalid pointer
> 2020-11-05 14:07:33.784 EST [26] LOG: background worker "parallel worker" (PID 150) was terminated by signal 6: Aborted
> 2020-11-05 14:07:33.784 EST [26] LOG: terminating any other active server processes
> 2020-11-05 14:07:33.784 EST [32] WARNING: terminating connection because of crash of another server process
> 2020-11-05 14:07:33.784 EST [32] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
> 2020-11-05 14:07:33.784 EST [32] HINT: In a moment you should be able to reconnect to the database and repeat your command.
> 2020-11-05 14:07:33.784 EST [61] WARNING: terminating connection because of crash of another server process
> 2020-11-05 14:07:33.784 EST [61] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
> 2020-11-05 14:07:33.784 EST [61] HINT: In a moment you should be able to reconnect to the database and repeat your command.
> 2020-11-05 14:07:34.699 EST [26] LOG: all server processes terminated; reinitializing
> 2020-11-05 14:07:42.266 EST [154] LOG: database system was interrupted; last known up at 2020-11-05 14:06:02 EST
> 2020-11-05 14:08:05.855 EST [154] LOG: database system was not properly shut down; automatic recovery in progress
> 2020-11-05 14:08:05.859 EST [154] LOG: redo starts at 7E/93B22C8
> 2020-11-05 14:08:15.931 EST [154] LOG: invalid record length at 7F/74ECBE30: wanted 24, got 0
> 2020-11-05 14:08:15.931 EST [154] LOG: redo done at 7F/74ECBDF8
> 2020-11-05 14:08:41.673 EST [26] LOG: database system is ready to accept connections
>
> PostgreSQL is installed in a Docker container, running on an EC2
> instance with 256 GB of memory
>

It'd be interesting to know what the crashing parallel worker is doing.
Considering it's a background worker, the easiest way is probably
enabling core dumps and inspecting them with gdb. Make sure you have
debug symbols installed and send us the backtrace.

Some basic instructions are in:

https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD#Getting_a_trace_from_a_randomly_crashing_backend
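
As a rough sketch of what that usually boils down to (not a copy of the
wiki page): allow core files for the server, wait for the crash, then
ask gdb for a backtrace from the core. The paths and the core file name
below are assumptions and will differ per installation; the script only
prints the two commands so they can be checked before running them.

```shell
#!/bin/sh
# Sketch only: every path below is an assumption, adjust to your setup.
PGBIN="${PGBIN:-/usr/lib/postgresql/12/bin}"   # assumed binary directory
PGDATA="${PGDATA:-/var/lib/postgresql/data}"   # assumed data directory
CORE="${CORE:-$PGDATA/core}"                   # assumed core file location

# Print the command that restarts the server with core files allowed.
# pg_ctl -c (--core-files) lifts the soft core-size limit for the server.
restart_cmd() {
    printf '%s\n' "$PGBIN/pg_ctl -D $PGDATA -c restart"
}

# Print the gdb command that writes a full backtrace from a core file.
# -batch makes gdb exit after running the -ex command.
backtrace_cmd() {
    printf '%s\n' "gdb -batch -ex 'bt full' $PGBIN/postgres $CORE"
}

restart_cmd
backtrace_cmd
```

Where the core file actually lands depends on the kernel's core_pattern
setting, which a container shares with its host. If postgres is the
main process of the container, the core limit probably has to be raised
when the container is started rather than through pg_ctl.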

The "free(): invalid pointer" message is most likely a glibc free()
error. I'm not sure where it's coming from or whether it has anything
to do with Docker.

Maybe try preparing a reproducer, i.e. a small database that triggers
the issue, which we could use to reproduce it on our machines.
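
One possible way to narrow things down, as a sketch only: dump a single
table and restore it into a scratch database with statement echo
enabled, so the last statement printed before the crash is visible. The
database name, table name and output file name are hypothetical
placeholders; the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: database and table names are hypothetical placeholders.
DB="${DB:-mydb}"              # assumed database name
TABLE="${TABLE:-public.t1}"   # assumed table to test in isolation

# Print a pg_dump command that exports one table (schema and data)
# into its own file.
dump_table_cmd() {
    printf '%s\n' "pg_dump -U postgres -d $DB -t $TABLE -f repro_$TABLE.sql"
}

# Print a psql command that restores that file, echoing each statement
# sent to the server (-e) and stopping at the first error.
restore_cmd() {
    printf '%s\n' "psql -U postgres -d $DB -e -v ON_ERROR_STOP=1 -f repro_$TABLE.sql"
}

dump_table_cmd
restore_cmd
```

Repeating this table by table should show which object the restore is
working on when the worker dies, and that object would be a candidate
for the small test case.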

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
