Re: Auto-vacuum timing out and preventing connections

From: David Johansen <davejohansen(at)gmail(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: Auto-vacuum timing out and preventing connections
Date: 2022-06-28 17:24:23
Message-ID: CAAcYxUcjZSoxf+YuQ1hLcAdCP7Q4_yn2mN2UfhLELSBby9bH1w@mail.gmail.com
Lists: pgsql-bugs

On Mon, Jun 27, 2022 at 10:42 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:

> Hi,
>
> On Mon, Jun 27, 2022 at 02:38:21PM -0600, David Johansen wrote:
> > We're running into an issue where the database can't be connected to. It
> > appears that the auto-vacuum is timing out and then that prevents new
> > connections from happening. This assumption is based on these logs
> > showing up in the logs:
> > WARNING: worker took too long to start; canceled
>
> I don't think that autovacuum is the reason of the problem, but just
> another victim of the same problem as the autovacuum launcher is still
> active and tries to schedule workers, which can't connect either.
>

Sorry, I should have provided some more details. These warnings show up in
the logs for 12-24 hours before the server stops accepting connections.

> > The log appears about every 5 minutes and eventually nothing can connect
> > to it and it has to be rebooted.
>
> Are you saying that you have to reboot every 5 minutes?
>

That warning is logged every 5 minutes, which is the autovacuum nap time.
We do not have to reboot every 5 minutes.
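
To check that the warnings really do line up with the nap time, here is a minimal sketch that measures the gaps between them in an exported log file. It assumes each log line starts with a "YYYY-MM-DD HH:MM:SS" timestamp; the function names, the 5 minute default, and the 30 second tolerance are my own choices for illustration, not anything PostgreSQL or RDS provides.

```python
import re
from datetime import datetime, timedelta

# Assumption: every log line begins with a "YYYY-MM-DD HH:MM:SS" timestamp.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")
# The warning text quoted earlier in this thread.
WARNING_TEXT = "worker took too long to start; canceled"


def warning_gaps(lines):
    """Return the time gaps between consecutive launcher warnings."""
    stamps = []
    for line in lines:
        if WARNING_TEXT not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    # Pair each warning with the next one and subtract.
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


def matches_nap_time(gaps, nap_time=timedelta(minutes=5),
                     tolerance=timedelta(seconds=30)):
    """True when there is at least one gap and all are close to nap_time."""
    return bool(gaps) and all(abs(gap - nap_time) <= tolerance for gap in gaps)
```

Passing the lines of a downloaded log file to warning_gaps() and the result to matches_nap_time() would show whether the warnings track the launcher's schedule or arrive at some other rhythm.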

> Also, do you mean reboot the server or just restarting the postgres
> service is enough?
>

Restarting the postgres service is enough.

> > These are the most similarly related previous posts, but the CPU usage
> > isn't high when this happens, so I don't believe that's the problem
> >
> > https://www.postgresql.org/message-id/20081105185206.GS4114%40alvh.no-ip.org
> >
> > https://www.postgresql.org/message-id/AANLkTinsGLeRc26RT5Kb4_HEhow5e97p0ZBveg=p9xqS@mail.gmail.com
> >
> > What can we do to diagnose this problem and get our database working
> > reliably again?
>
> As mentioned in the 2nd link, getting a strace of the postmaster when the
> problem happens may help.
>

This is running on RDS in AWS, so I don't believe I can run strace against
the service.
