FATAL: terminating connection due to administrator command

From: Alban Hertroys <alban(dot)hertroys(at)apollovredestein(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: FATAL: terminating connection due to administrator command
Date: 2020-10-01 09:17:17
Message-ID: OFACE0CA5D.70F6B677-ONC12585F4.002E818B-C12585F4.0032E67C@apollotyres.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi all,

We're seeing the FATAL error message from the subject pop up in our logs
at regular intervals, but I haven't been able to pinpoint what is causing
it. I'm hoping for some insights here.

We run a PostgreSQL 11.9 server on CentOS 7, within a vmware environment:
PostgreSQL 11.9 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-39), 64-bit

The package was installed from the PGDG repository.

I'm not even sure I should be worried, there doesn't appear to be any
impact on the servers' functioning, but it does say 'FATAL'.
What we're seeing are lines like these two instances:

2020-09-30 21:45:09.999 CEST [2375] 172.30.2.25 asdf STATEMENT: select
count(*) from "util_asdf"."v_something_something2"
2020-09-30 21:45:15.018 CEST [2375] 172.30.2.25 asdf ERROR: canceling
statement due to user request
2020-09-30 21:45:15.018 CEST [2375] 172.30.2.25 asdf STATEMENT: select
count(*) from "dm_asdf"."asdf_func"
2020-09-30 21:45:20.027 CEST [2375] 172.30.2.25 asdf ERROR: canceling
statement due to user request
2020-09-30 21:45:20.027 CEST [2375] 172.30.2.25 asdf STATEMENT: select
count(*) from "dm_asdf"."asdf_func"
2020-09-30 22:27:51.422 CEST [20270] 172.30.2.26 selfservice_sales ERROR:
canceling statement due to user request
2020-09-30 22:27:51.422 CEST [20270] 172.30.2.26 selfservice_sales
STATEMENT: select count(*) from "dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:51.422 CEST [30649] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:51.422 CEST [30649] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:51.422 CEST [30648] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:51.422 CEST [30648] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:51.422 CEST [30647] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:51.422 CEST [30647] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:51.422 CEST [30646] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:51.422 CEST [30646] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:51.422 CEST [30645] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:51.422 CEST [30645] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:51.435 CEST [6482] LOG: background worker "parallel
worker" (PID 30646) exited with exit code 1
2020-09-30 22:27:51.435 CEST [6482] LOG: background worker "parallel
worker" (PID 30645) exited with exit code 1
2020-09-30 22:27:51.435 CEST [6482] LOG: background worker "parallel
worker" (PID 30647) exited with exit code 1
2020-09-30 22:27:51.435 CEST [6482] LOG: background worker "parallel
worker" (PID 30648) exited with exit code 1
2020-09-30 22:27:51.435 CEST [6482] LOG: background worker "parallel
worker" (PID 30649) exited with exit code 1
2020-09-30 22:27:56.446 CEST [20270] 172.30.2.26 selfservice_sales ERROR:
canceling statement due to user request
2020-09-30 22:27:56.446 CEST [20270] 172.30.2.26 selfservice_sales
STATEMENT: select count(*) from "dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:56.446 CEST [30659] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:56.446 CEST [30659] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:56.446 CEST [30658] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:56.446 CEST [30658] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:56.446 CEST [30657] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:56.446 CEST [30657] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:56.446 CEST [30656] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:56.446 CEST [30656] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:56.446 CEST [30655] FATAL: terminating connection due
to administrator command
2020-09-30 22:27:56.446 CEST [30655] STATEMENT: select count(*) from
"dm_b2b"."prlwytzkofskiv1"
2020-09-30 22:27:56.459 CEST [6482] LOG: background worker "parallel
worker" (PID 30655) exited with exit code 1
2020-09-30 22:27:56.459 CEST [6482] LOG: background worker "parallel
worker" (PID 30656) exited with exit code 1
2020-09-30 22:27:56.459 CEST [6482] LOG: background worker "parallel
worker" (PID 30657) exited with exit code 1
2020-09-30 22:27:56.459 CEST [6482] LOG: background worker "parallel
worker" (PID 30658) exited with exit code 1
2020-09-30 22:27:56.459 CEST [6482] LOG: background worker "parallel
worker" (PID 30659) exited with exit code 1
2020-09-30 22:43:08.459 CEST [8055] 172.30.2.25 selfservice_prd ERROR:
schema "somethingelse" does not exist at character 71

Apparently, something is sending SIGTERM to our pg processes. I know that
I'm not doing that, certainly not at those hours, and I'm the one who set
up this system and am the only DBA of it.

Advice I found on the Internet is to use systemtap with some tap-script,
but the scripts that I found just displayed the PID's of processes without
telling me their names, which I didn't find all that useful in figuring
out who was responsible, so I made an attempt (I have no experience with
stap) at modifying it to print process names of signal sender and target:

/*
* killsnoop-nd.stp Trace process signals.
* For Linux, uses SystemTap (non-debuginfo).
*
* Copyright (C) 2015 Brendan Gregg.
(etc)
*/
global target;
global signal;

probe begin
{
printf("%-6s %-12s %-5s %-6s %6s\n", "FROM", "COMMAND", "SIG",
"TO", "COMMAND");
}

probe nd_syscall.kill
{
target[tid()] = uint_arg(1);
signal[tid()] = uint_arg(2);
}

probe nd_syscall.kill.return
{
if (signal[tid()] == 15 && target[tid()] != 0) {
printf("%-6d %-12s %-5d %-6d %12s\n"
, pid(), execname()
, signal[tid()]
, target[tid()],
pid2execname(target[tid()]));
}
delete target[tid()];
delete signal[tid()];
}

The output of last night was:

FROM COMMAND SIG TO COMMAND
30068 systemd-udevd 15 14151 systemd-udevd
30068 systemd-udevd 15 14836 systemd-udevd
(...)
6482 postmaster 15 30649 postmaster
6482 postmaster 15 30648 postmaster
6482 postmaster 15 30647 postmaster
6482 postmaster 15 30646 postmaster
6482 postmaster 15 30645 postmaster
6482 postmaster 15 30659 postmaster
6482 postmaster 15 30658 postmaster
6482 postmaster 15 30657 postmaster
6482 postmaster 15 30656 postmaster
6482 postmaster 15 30655 postmaster
6482 postmaster 15 2065 postmaster
6482 postmaster 15 2064 postmaster
6482 postmaster 15 2063 postmaster

Several of these TO-pid's match those in the PG log.

$ ps aux | grep 6482
postgres 6482 0.0 0.5 51755776 1043872 ? Ss Sep28 3:25
/usr/pgsql-11/bin/postmaster -D /data/11/data

Am I correct in concluding that postmaster is killing its own processes?
If so, what is going on here? And more importantly, what do I do about it?

Regards,
Alban Hertroys.

P.S. I'm mailing from my work account, as this is work related. You may
remember me from my private account.

Alban Hertroys
D: +31 (0)53 4 888 888 | T: +31 (0)53 4888 888 | E: alban(dot)hertroys(at)apollovredestein(dot)com
Apollo Vredestein B.V.| Ir. E.L.C. Schiffstraat 370, 7547 RD Enschede, The
Netherlands
Chamber of Commerce number: 34223268




The information contained in this e-mail is intended solely for the use of the
individual or entity to whom it is addressed. If you are not the intended
recipient, you are hereby notified that any disclosure, copying, distribution
or action in relation to the contents of this information is strictly
prohibited and may be unlawful and request you to delete this message and any
attachments and advise the sender by return e-mail. The confidentiality of this
message is not warranted. Apollo Vredestein and its subsidiaries rule out any
and every liability resulting from this or any other electronic transmission



Please consider the environment before printing this e-mail

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Srinivasa T N 2020-10-01 09:47:33 Re: FATAL: terminating connection due to administrator command
Previous Message Matthias Apitz 2020-10-01 06:36:55 Re: Problem close curser after rollback