Re: Missing Chunk Error when doing a VACUUM FULL operation - DB Corruption?

From: Arjun Ranade <ranade(at)nodalexchange(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-admin(at)postgresql(dot)org
Subject: Re: Missing Chunk Error when doing a VACUUM FULL operation - DB Corruption?
Date: 2017-11-02 21:17:43
Message-ID: CANrrCRxk=26f+UaaQBv8BKFk6skH4FCEA+Q_EV-oE2tYaW4s6Q@mail.gmail.com
Lists: pgsql-admin

Yes, we are now in the process of adding custom metrics/alerts around the
xmin horizon across all of our postgres databases.

We will do a DB-wide VACUUM FULL as well (ironically, this incident started
because VACUUM FULL failed last weekend).

Appreciate all the input on this.

Arjun
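
[Editor's sketch, not part of the original message.] The xmin-horizon alerting mentioned above could look roughly like the following. It is a minimal sketch under stated assumptions: the query reads the standard pg_replication_slots view, while the function name find_stuck_slots, the row layout, and the threshold are hypothetical choices made for illustration. Running the query and delivering the alert are left to whatever client and monitoring system is already in use.

```python
# Query against the pg_replication_slots view (PostgreSQL 9.4 and later).
# age() reports how many transactions ago the given xid was assigned.
SLOT_AGE_SQL = """
SELECT slot_name,
       active,
       age(xmin) AS xmin_age,
       age(catalog_xmin) AS catalog_xmin_age
FROM pg_replication_slots
"""


def find_stuck_slots(rows, max_age):
    """Return (slot_name, active, age) for slots holding xmin back too far.

    rows is an iterable of (slot_name, active, xmin_age, catalog_xmin_age)
    tuples, as produced by SLOT_AGE_SQL. Either age may be None when the
    slot does not pin that horizon. max_age is a hypothetical alert
    threshold in transactions; pick it to suit the workload.
    """
    stuck = []
    for slot_name, active, xmin_age, catalog_xmin_age in rows:
        ages = [a for a in (xmin_age, catalog_xmin_age) if a is not None]
        if ages and max(ages) > max_age:
            stuck.append((slot_name, active, max(ages)))
    # Oldest horizon first, since that slot is the one holding VACUUM back.
    return sorted(stuck, key=lambda item: item[2], reverse=True)
```

An inactive slot with a large age is the pattern that matched this incident: nothing is consuming from it, yet it still prevents dead rows from being removed.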

On Thu, Nov 2, 2017 at 11:06 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> Tom, Arjun,
>
> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> > Arjun Ranade <ranade(at)nodalexchange(dot)com> writes:
> > > After dropping the replication slot, VACUUM FULL runs fine now and no
> > > longer reports the "oldest xmin is far in the past"
> >
> > Excellent. Maybe we should think about providing better tools to notice
> > "stuck" replication slots.
>
> +1
>
> > In the meantime, you probably realize this already, but if global xmin
> > has been stuck for months then you're going to have terrible bloat
> > everywhere. Database-wide VACUUM FULL seems called for.
>
> This, really, is also a lesson in "monitor your distance to transaction
> wrap-around". You really should know something is up a lot sooner than
> the warnings from PG showing up in the logs.
>
> Thanks!
>
> Stephen
>
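
[Editor's sketch, not part of the original thread.] Stephen's advice to monitor the distance to transaction wraparound can be illustrated as follows. The query uses age(datfrozenxid) from pg_database, which is the usual way to read each database's transaction ID age. The two thresholds and the function names are assumptions chosen for illustration only; they simply sit well above the default autovacuum_freeze_max_age of 200 million and well below the roughly two billion transaction limit, so that an alert fires long before PostgreSQL itself starts logging warnings.

```python
# Per-database transaction ID age, read from the pg_database catalog.
DB_AGE_SQL = "SELECT datname, age(datfrozenxid) AS xid_age FROM pg_database"

# Hypothetical alert thresholds, in transactions. Tune them per installation.
WARN_AGE = 500_000_000
CRIT_AGE = 1_000_000_000


def classify_xid_age(xid_age, warn=WARN_AGE, crit=CRIT_AGE):
    """Map a transaction ID age to 'ok', 'warning', or 'critical'."""
    if xid_age >= crit:
        return "critical"
    if xid_age >= warn:
        return "warning"
    return "ok"


def wraparound_report(rows, warn=WARN_AGE, crit=CRIT_AGE):
    """Build {datname: status} from (datname, xid_age) rows of DB_AGE_SQL."""
    return {datname: classify_xid_age(age, warn, crit) for datname, age in rows}
```

For example, wraparound_report([("postgres", 150_000_000), ("app", 750_000_000)]) would flag only "app", at the "warning" level.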
