Re: Missing Chunk Error when doing a VACUUM FULL operation - DB Corruption?

From: Arjun Ranade <ranade(at)nodalexchange(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Missing Chunk Error when doing a VACUUM FULL operation - DB Corruption?
Date: 2017-10-31 21:27:41
Message-ID: CANrrCRy-qdDMX=a=xQsC9bbv36kD5biKUK3RgpNxUO9M-VfcTA@mail.gmail.com
Lists: pgsql-admin

Yes, it is still reproducible. When I try to VACUUM FULL right now, I get
the same error.

This is postgres 9.4. I am new to gdb, but based on what you said, I have
the following output:

Breakpoint 1, 0x0000000000768be0 in errfinish ()
(gdb) bt
#0 0x0000000000768be0 in errfinish ()
#1 0x000000000076998c in elog_finish ()
#2 0x0000000000495960 in ?? ()
#3 0x0000000000496075 in heap_tuple_fetch_attr ()
#4 0x0000000000496572 in toast_insert_or_update ()
#5 0x0000000000492ce1 in ?? ()
#6 0x0000000000493733 in rewrite_heap_tuple ()
#7 0x000000000053ebdf in ?? ()
#8 0x000000000053f68e in cluster_rel ()
#9 0x0000000000590a1b in ?? ()
#10 0x0000000000590f9f in vacuum ()
#11 0x000000000068ee77 in standard_ProcessUtility ()
#12 0x00007f50c6b583d4 in ?? () from /usr/pgsql-9.4/lib/pglogical.so
#13 0x000000000068bb17 in ?? ()
#14 0x000000000068caad in ?? ()
#15 0x000000000068d132 in PortalRun ()
#16 0x000000000068979e in ?? ()
#17 0x000000000068ae18 in PostgresMain ()
#18 0x0000000000635d69 in PostmasterMain ()
#19 0x00000000005cd248 in main ()
(gdb) cont

On Tue, Oct 31, 2017 at 3:59 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Arjun Ranade <ranade(at)nodalexchange(dot)com> writes:
> > We had some downtime recently and I thought it would be a good idea to do a
> > periodic VACUUM FULL of one of our large Postgres DB's.
>
> > However, when I tried to attempt the VACUUM FULL, I saw the following error:
> > INFO: vacuuming "pg_catalog.pg_statistic"
> > vacuumdb: vacuuming of database "db1" failed: ERROR: missing chunk number 0 for toast value 30382746 in pg_toast_2619
>
> We hear reports like this just often enough to make it seem like there's
> some bug that afflicts pg_statistic specifically. Nobody's ever found
> a cause though.
>
> > Given that pg_statistic is inessential data (it can be rebuilt by analyzing
> > each table), I did a 'DELETE FROM pg_statistic;' which removed all the
> > rows.
>
> That would've been my advice ...
>
> > However, when I ran the VACUUM FULL again, I received the same error.
>
> ... but that's really interesting. Is this still reproducible? If you
> could get a stack trace from the point of the error, that might yield
> useful data. (Set a gdb breakpoint at "errfinish", run the VACUUM FULL,
> and when it stops, get the stack with "bt". Don't use VERBOSE, or you'll
> reach errfinish for each line of verbose output...)
>
> Also, what PG version is this exactly?
>
> regards, tom lane
>
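
For anyone following along, the gdb steps described in the quoted message could be scripted roughly as below. This is only a sketch: the command file name errfinish.gdb is a hypothetical choice, PID stands for whatever pg_backend_pid() reports in the session that will run the VACUUM FULL, and it assumes gdb is installed and run as a user allowed to attach to the backend. Many frames will still show as "??" unless debug symbols for the server are installed.

```shell
#!/bin/sh
# Write a gdb command file that stops at errfinish and prints the stack.
# The file name is an arbitrary (hypothetical) choice.
cmdfile="errfinish.gdb"
cat > "$cmdfile" <<'EOF'
break errfinish
continue
bt
detach
quit
EOF
echo "wrote $cmdfile"

# Manual steps (not run by this script):
#   1. In psql, in the session that will run the vacuum:
#        SELECT pg_backend_pid();
#   2. In another terminal, attach to that backend (replace PID):
#        gdb -p PID -x errfinish.gdb
#   3. Back in psql, run VACUUM FULL without VERBOSE, so that errfinish
#      is not reached for every line of verbose output.
```

gdb waits at "continue" until the backend reaches errfinish, then prints the backtrace and detaches, so the backend goes on to report the error as usual.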
