Re: Postgres will not allow new connections, suspended process, waiting error

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Prateek Mahajan <prateekm99(at)gmail(dot)com>
Cc: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Postgres will not allow new connections, suspended process, waiting error
Date: 2017-07-02 16:07:36
Message-ID: CABUevEx_3h4T+9AuzgCHm+78qudwZjYDdw2kLpJiFjWQv9EMqg@mail.gmail.com
Lists: pgsql-admin

On Sat, Jul 1, 2017 at 12:59 AM, Prateek Mahajan <prateekm99(at)gmail(dot)com>
wrote:

> More details.
>
> Environment
> PostgreSQL 9.5, EnterpriseDB Postgres installer
> Windows Server 2012R2 with Active Directory
> Symantec End Point Protection
>
> Symptom:
>
> After about 1 week of running, one of the PostgreSQL processes (postgres.exe)
> showed "suspended" in task manager, and I could not kill it in the task
> manager (an "Access Denied" error message appeared). This "suspended" process
> was not the master PID as indicated in the postmaster.pid file.
> Current live connections still work, but one cannot establish new
> connections. The only solution that I have is to restart the server.
>
> Other information:
>
> The PostgreSQL service is run under a domain account.
> The maximum connection limit was never reached, as it is set to 1000 and we
> only had about 10 connections.
> There was plenty of available memory. The total memory is 288GB and
> only 8% was used.
> There was minimal hard drive activity when it occurred. The C drive where
> PostgreSQL was installed had about 86GB of free space.
> There are 4 additional tablespaces that are not on the C drive but spread
> over 4 hard drives. Each of the 4 hard drives has more than 500GB of space.
> We have been using the same configuration files for years, and the same
> file is also used on a second PostgreSQL server, which does not have the
> issue at all.
> The PostgreSQL logs had something like this when this happened and it
> continues to produce this warning message every minute or so:
>
> 2017-06-28 19:40:21 CDT WARNING: worker took too long to start; canceled
> 2017-06-28 19:41:21 CDT WARNING: worker took too long to start; canceled
> 2017-06-28 19:42:21 CDT WARNING: worker took too long to start; canceled
> 2017-06-28 19:43:21 CDT WARNING: worker took too long to start; canceled
>
>
Those are autovacuum workers trying to start. My guess is that's a symptom
of the same basic problem, which is that your machine behaves as if it's
heavily overloaded.
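
The one-minute spacing of those warnings is consistent with the autovacuum
launcher retrying on its default schedule (autovacuum_naptime defaults to
1min), though that is an inference from the excerpt rather than something
confirmed on this server. To check whether the pattern holds across a whole
log file, something like the Python sketch below could be used. It assumes
the log_line_prefix produces exactly the timestamp format shown in the
excerpt, and the helper name warning_gaps is made up for illustration. The
time zone abbreviation is ignored, so a DST change in the middle of the log
would distort one gap.

```python
import re
from datetime import datetime

# Matches lines such as:
#   2017-06-28 19:40:21 CDT WARNING: worker took too long to start; canceled
# The prefix layout is an assumption taken from the quoted log excerpt.
PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \S+ WARNING:\s+"
    r"worker took too long to start; canceled"
)


def warning_gaps(lines):
    """Return the gaps, in seconds, between consecutive matching warnings."""
    stamps = []
    for line in lines:
        m = PATTERN.match(line.strip())
        if m:
            stamps.append(datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S"))
    # Pair each timestamp with the next one and take the difference.
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


# The four lines quoted in the original message.
sample = [
    "2017-06-28 19:40:21 CDT WARNING: worker took too long to start; canceled",
    "2017-06-28 19:41:21 CDT WARNING: worker took too long to start; canceled",
    "2017-06-28 19:42:21 CDT WARNING: worker took too long to start; canceled",
    "2017-06-28 19:43:21 CDT WARNING: worker took too long to start; canceled",
]

print(warning_gaps(sample))
```

For a real log, pass an open file object instead of the sample list. The
timestamp of the first matching warning also gives a rough time for when the
machine started misbehaving, which can be compared against the Windows event
log or the endpoint protection logs.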

As a first step, I'd try removing Symantec Endpoint Protection and see if
that helps. It's very common for software like that to break the database.
And being unable to kill a process in the task manager clearly indicates that
the problem lies outside the control of Postgres.

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/>
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>
