Re: Postgres Crashing

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Doug Roberts <h205881(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Postgres Crashing
Date: 2020-02-04 16:18:13
Message-ID: 86514399-4198-ad0a-68de-0ba111ec65c3@aklaver.com
Lists: pgsql-general

On 2/4/20 8:06 AM, Doug Roberts wrote:
> Hello,
>
> Here is a stacktrace of what happened before and after the crash.

Actually, what is below is the Postgres log, not a stack trace. Per Tom's
previous post, the procedure to get a stack trace can be found here:

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend
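
That said, the log does carry one useful clue on its own: the exception
code 0xC0000005 is STATUS_ACCESS_VIOLATION in ntstatus.h, roughly the
Windows counterpart of a segfault. Below is a minimal Python sketch for
pulling the crashed PID and the exception code out of a log line like the
first one quoted below. The regex, the function name and the small lookup
table are my own illustration, not anything shipped with Postgres.

```python
import re

# Illustrative lookup only; the values come from the Windows ntstatus.h header.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}

# Matches the postmaster's crash report message as it appears in the log.
CRASH_RE = re.compile(
    r"server process \(PID\s+(\d+)\)\s+was terminated by exception (0x[0-9A-Fa-f]+)"
)


def parse_crash(log_text):
    """Return (pid, code, name) for the first crash report found, or None."""
    # Collapse whitespace so a message wrapped across lines still matches.
    flat = " ".join(log_text.split())
    m = CRASH_RE.search(flat)
    if m is None:
        return None
    code = int(m.group(2), 16)
    return int(m.group(1)), code, NTSTATUS_NAMES.get(code, "unknown")


line = (
    "2020-02-04 10:26:16.841 EST [20788] [0] LOG:  00000: server process "
    "(PID 12168) was terminated by exception 0xC0000005"
)
print(parse_crash(line))
```

This only tells you which backend died and how; it is not a substitute
for the stack trace from the procedure above.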

>
> Thanks,
>
> Doug
>
> 2020-02-04 10:26:16.841 EST [20788] [0] LOG:  00000: server process (PID
> 12168) was terminated by exception 0xC0000005
> 2020-02-04 10:26:16.841 EST [20788] [0] DETAIL:  Failed process was
> running: select CONTAINERS_RESET_RECIRC_BY_DP(3000)
> 2020-02-04 10:26:16.841 EST [20788] [0] HINT:  See C include file
> "ntstatus.h" for a description of the hexadecimal value.
> 2020-02-04 10:26:16.841 EST [20788] [0] LOCATION:  LogChildExit,
> postmaster.c:3670
> 2020-02-04 10:26:16.841 EST [20788] [0] LOG:  00000: terminating any
> other active server processes
> 2020-02-04 10:26:16.841 EST [20788] [0] LOCATION:  HandleChildCrash,
> postmaster.c:3400
> 2020-02-04 10:26:16.873 EST [1212] [0] WARNING:  57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.873 EST [1212] [0] DETAIL:  The postmaster has
> commanded this server process to roll back the current transaction and
> exit, because another server process exited abnormally and possibly
> corrupted shared memory.
> 2020-02-04 10:26:16.873 EST [1212] [0] HINT:  In a moment you should be
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.873 EST [1212] [0] LOCATION:  quickdie, postgres.c:2717
> 2020-02-04 10:26:16.873 EST [19436] [0] WARNING:  57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.873 EST [19436] [0] DETAIL:  The postmaster has
> commanded this server process to roll back the current transaction and
> exit, because another server process exited abnormally and possibly
> corrupted shared memory.
> 2020-02-04 10:26:16.873 EST [19436] [0] HINT:  In a moment you should be
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.873 EST [19436] [0] LOCATION:  quickdie, postgres.c:2717
> 2020-02-04 10:26:16.874 EST [13428] [0] WARNING:  57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.874 EST [13428] [0] DETAIL:  The postmaster has
> commanded this server process to roll back the current transaction and
> exit, because another server process exited abnormally and possibly
> corrupted shared memory.
> 2020-02-04 10:26:16.874 EST [13428] [0] HINT:  In a moment you should be
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.874 EST [13428] [0] CONTEXT:  while locking tuple
> (0,115) in relation "containers"
> SQL statement "UPDATE containers
>            SET type_uid = COALESCE(declared_type_uid, type_uid),
>                carton_type_uid = COALESCE(declared_carton_type_uid,
> carton_type_uid),
>                status_uid = COALESCE(declared_status_uid, status_uid),
>                order_uid = COALESCE(in_order_uid, order_uid),
>                wave_uid = COALESCE(in_wave_uid, wave_uid),
>                length = COALESCE(in_length, carton_length, length),
>                width = COALESCE(in_width, carton_width, width),
>                height = COALESCE(in_height, carton_height, height),
>                weight = COALESCE(in_weight, weight),
>                weight_minimum = COALESCE(in_weight_minimum,
> weight_minimum),
>                weight_maximum = COALESCE(in_weight_maximum,
> weight_maximum),
>                weight_expected = COALESCE(in_weight_expected,
> weight_expected),
>                first_seen_DP_id = COALESCE(first_seen_DP_id,
> in_last_seen_DP_id),
>                first_seen_datetime = COALESCE(first_seen_datetime,
> last_seen_date_time),
>                last_seen_DP_id = COALESCE(in_last_seen_DP_id,
> last_seen_DP_id),
>                last_seen_datetime = COALESCE(last_seen_date_time,
> last_seen_datetime),
>                recirculation_count = COALESCE(in_recirculation_count,
> recirculation_count),
>                project_flags = COALESCE(in_project_flags, project_flags),
>                passed_weight_check = COALESCE(in_passed_weight_check,
> passed_weight_check)
>            WHERE uid = in_uid"
> PL/pgSQL function
> containers_add_update(integer,integer,integer,integer,integer,integer,double
> precision,double precision,double precision,double precision,double
> precision,double precision,double precision,integer,timestamp without
> time zone,character varying,bigint,boolean) line 60 at SQL statement
> 2020-02-04 10:26:16.874 EST [13428] [0] LOCATION:  quickdie, postgres.c:2717
> 2020-02-04 10:26:16.874 EST [25916] [0] WARNING:  57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.874 EST [25916] [0] DETAIL:  The postmaster has
> commanded this server process to roll back the current transaction and
> exit, because another server process exited abnormally and possibly
> corrupted shared memory.
> 2020-02-04 10:26:16.874 EST [25916] [0] HINT:  In a moment you should be
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.874 EST [25916] [0] CONTEXT:  while locking tuple
> (1,91) in relation "containers"
> SQL statement "UPDATE containers
>            SET type_uid = COALESCE(declared_type_uid, type_uid),
>                carton_type_uid = COALESCE(declared_carton_type_uid,
> carton_type_uid),
>                status_uid = COALESCE(declared_status_uid, status_uid),
>                order_uid = COALESCE(in_order_uid, order_uid),
>                wave_uid = COALESCE(in_wave_uid, wave_uid),
>                length = COALESCE(in_length, carton_length, length),
>                width = COALESCE(in_width, carton_width, width),
>                height = COALESCE(in_height, carton_height, height),
>                weight = COALESCE(in_weight, weight),
>                weight_minimum = COALESCE(in_weight_minimum,
> weight_minimum),
>                weight_maximum = COALESCE(in_weight_maximum,
> weight_maximum),
>                weight_expected = COALESCE(in_weight_expected,
> weight_expected),
>                first_seen_DP_id = COALESCE(first_seen_DP_id,
> in_last_seen_DP_id),
>                first_seen_datetime = COALESCE(first_seen_datetime,
> last_seen_date_time),
>                last_seen_DP_id = COALESCE(in_last_seen_DP_id,
> last_seen_DP_id),
>                last_seen_datetime = COALESCE(last_seen_date_time,
> last_seen_datetime),
>                recirculation_count = COALESCE(in_recirculation_count,
> recirculation_count),
>                project_flags = COALESCE(in_project_flags, project_flags),
>                passed_weight_check = COALESCE(in_passed_weight_check,
> passed_weight_check)
>            WHERE uid = in_uid"
> PL/pgSQL function
> containers_add_update(integer,integer,integer,integer,integer,integer,double
> precision,double precision,double precision,double precision,double
> precision,double precision,double precision,integer,timestamp without
> time zone,character varying,bigint,boolean) line 60 at SQL statement
> 2020-02-04 10:26:16.874 EST [25916] [0] LOCATION:  quickdie, postgres.c:2717
> 2020-02-04 10:26:16.875 EST [2512] [0] WARNING:  57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.875 EST [2512] [0] DETAIL:  The postmaster has
> commanded this server process to roll back the current transaction and
> exit, because another server process exited abnormally and possibly
> corrupted shared memory.
> 2020-02-04 10:26:16.875 EST [2512] [0] HINT:  In a moment you should be
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.875 EST [2512] [0] CONTEXT:  while locking tuple
> (0,111) in relation "containers"
> SQL statement "UPDATE containers
>            SET type_uid = COALESCE(declared_type_uid, type_uid),
>                carton_type_uid = COALESCE(declared_carton_type_uid,
> carton_type_uid),
>                status_uid = COALESCE(declared_status_uid, status_uid),
>                order_uid = COALESCE(in_order_uid, order_uid),
>                wave_uid = COALESCE(in_wave_uid, wave_uid),
>                length = COALESCE(in_length, carton_length, length),
>                width = COALESCE(in_width, carton_width, width),
>                height = COALESCE(in_height, carton_height, height),
>                weight = COALESCE(in_weight, weight),
>                weight_minimum = COALESCE(in_weight_minimum,
> weight_minimum),
>                weight_maximum = COALESCE(in_weight_maximum,
> weight_maximum),
>                weight_expected = COALESCE(in_weight_expected,
> weight_expected),
>                first_seen_DP_id = COALESCE(first_seen_DP_id,
> in_last_seen_DP_id),
>                first_seen_datetime = COALESCE(first_seen_datetime,
> last_seen_date_time),
>                last_seen_DP_id = COALESCE(in_last_seen_DP_id,
> last_seen_DP_id),
>                last_seen_datetime = COALESCE(last_seen_date_time,
> last_seen_datetime),
>                recirculation_count = COALESCE(in_recirculation_count,
> recirculation_count),
>                project_flags = COALESCE(in_project_flags, project_flags),
>                passed_weight_check = COALESCE(in_passed_weight_check,
> passed_weight_check)
>            WHERE uid = in_uid"
> PL/pgSQL function
> containers_add_update(integer,integer,integer,integer,integer,integer,double
> precision,double precision,double precision,double precision,double
> precision,double precision,double precision,integer,timestamp without
> time zone,character varying,bigint,boolean) line 60 at SQL statement
> 2020-02-04 10:26:16.875 EST [2512] [0] LOCATION:  quickdie, postgres.c:2717
> 2020-02-04 10:26:16.879 EST [14908] [0] WARNING:  57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.879 EST [14908] [0] DETAIL:  The postmaster has
> commanded this server process to roll back the current transaction and
> exit, because another server process exited abnormally and possibly
> corrupted shared memory.
> 2020-02-04 10:26:16.879 EST [14908] [0] HINT:  In a moment you should be
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.879 EST [14908] [0] LOCATION:  quickdie, postgres.c:2717
> 2020-02-04 10:26:16.880 EST [7092] [0] WARNING:  57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.880 EST [7092] [0] DETAIL:  The postmaster has
> commanded this server process to roll back the current transaction and
> exit, because another server process exited abnormally and possibly
> corrupted shared memory.
> 2020-02-04 10:26:16.880 EST [7092] [0] HINT:  In a moment you should be
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.880 EST [7092] [0] LOCATION:  quickdie, postgres.c:2717
> 2020-02-04 10:26:16.975 EST [14360] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:16.975 EST [14360] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.033 EST [20788] [0] LOG:  00000: all server
> processes terminated; reinitializing
> 2020-02-04 10:26:17.033 EST [20788] [0] LOCATION:
>  PostmasterStateMachine, postmaster.c:3912
> 2020-02-04 10:26:17.105 EST [20964] [0] LOG:  00000: database system was
> interrupted; last known up at 2020-02-04 10:26:09 EST
> 2020-02-04 10:26:17.105 EST [20964] [0] LOCATION:  StartupXLOG, xlog.c:6277
> 2020-02-04 10:26:17.115 EST [1668] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.115 EST [1668] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.179 EST [25800] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.179 EST [25800] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.301 EST [14700] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.301 EST [14700] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.309 EST [19060] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.309 EST [19060] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.378 EST [24772] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.378 EST [24772] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.434 EST [12972] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.434 EST [12972] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.492 EST [11208] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.492 EST [11208] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.548 EST [13236] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.548 EST [13236] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.607 EST [25756] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.607 EST [25756] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.677 EST [12944] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.677 EST [12944] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.737 EST [14712] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.737 EST [14712] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:18.104 EST [20964] [0] LOG:  00000: database system was
> not properly shut down; automatic recovery in progress
> 2020-02-04 10:26:18.104 EST [20964] [0] LOCATION:  StartupXLOG, xlog.c:6774
> 2020-02-04 10:26:18.109 EST [20964] [0] LOG:  00000: redo starts at
> 14/52009F08
> 2020-02-04 10:26:18.109 EST [20964] [0] LOCATION:  StartupXLOG, xlog.c:7045
> 2020-02-04 10:26:18.349 EST [23064] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:18.349 EST [23064] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:19.248 EST [8816] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:19.248 EST [8816] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:20.560 EST [18200] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:20.560 EST [18200] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:22.508 EST [23204] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:22.508 EST [23204] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:25.402 EST [5888] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:25.402 EST [5888] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:29.714 EST [16820] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:29.714 EST [16820] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:36.161 EST [24072] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:36.161 EST [24072] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:45.806 EST [22000] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:45.806 EST [22000] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:55.687 EST [20964] [0] LOG:  00000: redo done at
> 14/79A030E0
> 2020-02-04 10:26:55.687 EST [20964] [0] LOCATION:  StartupXLOG, xlog.c:7307
> 2020-02-04 10:26:55.861 EST [16700] [0] FATAL:  57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:55.861 EST [16700] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:57.016 EST [20788] [0] LOG:  00000: database system is
> ready to accept connections
>
> On Tue, Feb 4, 2020 at 9:20 AM Doug Roberts <h205881(at)gmail(dot)com
> <mailto:h205881(at)gmail(dot)com>> wrote:
>
> > So how did containers_reset_recirc() come to clash with
> > containers_add_update()?
>
> They are clashing because another portion of our system is
> running and updating containers. The reset recirc function was
> run at the same time to see how our system and the database
> would handle it.
>
> The recirc string is formatted like 2000=3,1000=6,5000=0. So the
> reset recirc function will take a UID (1000 for example) and use
> that to remove 1000=x from all of the recirc counts for all of
> the containers that have 1000=x.
>
> We are currently using PG 12.0.
>
> Thanks,
>
> Doug
>
> On Mon, Feb 3, 2020 at 6:21 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us
> <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>> wrote:
>
> Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com
> <mailto:adrian(dot)klaver(at)aklaver(dot)com>> writes:
> > Please reply to list also.
>
> > On 2/3/20 2:18 PM, Doug Roberts wrote:
> >> Here is what the reset recirc function is doing.
> >> ...
> >>     UPDATE containers
> >> ...
>
> > So how did containers_reset_recirc() come to clash with
> > containers_add_update()?
>
> If this is PG 12.0 or 12.1, a likely theory is that this is an
> EvalPlanQual bug (which'd be triggered during concurrent updates
> of the same row in the table, so that squares with the observation
> that locking the table prevents it).  The known bugs in that area
> require either before-row-update triggers on the table, or
> child tables (either partitioning or traditional inheritance).
> So I wonder what the schema of table "containers" looks like.
>
> Or you could have hit some new bug ... but there's not enough
> info here to diagnose.
>
>                         regards, tom lane
>
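
To check that I am reading the recirc string description above correctly
(for example 2000=3,1000=6,5000=0): resetting a given UID should drop that
UID's key=count pair and leave the others untouched. Here is a rough
Python sketch of just that string manipulation, assuming comma-separated
key=count pairs. The function name is made up, and this is not the actual
body of the reset recirc function, which was not posted in full.

```python
def reset_recirc(recirc, uid):
    """Remove the 'uid=count' entry from a comma-separated recirc string."""
    key = str(uid)
    kept = [
        pair
        for pair in recirc.split(",")
        # Keep every non-empty pair whose key part is not the one being reset.
        if pair and pair.split("=", 1)[0].strip() != key
    ]
    return ",".join(kept)


print(reset_recirc("2000=3,1000=6,5000=0", 1000))  # 2000=3,5000=0
```

If the real function does this rewrite in a single UPDATE over every
matching row, it would touch the same rows that containers_add_update()
is updating, which fits the concurrent-update theory above.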

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
