Re: VACUUM unable to accomplish because of a non-existent MultiXactId

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Kouber Saparev <kouber(at)gmail(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: VACUUM unable to accomplish because of a non-existent MultiXactId
Date: 2015-11-27 18:49:36
Message-ID: 20151127184936.GD4320@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Kouber Saparev wrote:
> Hello,
>
> Recently I spotted an auto vacuum that is constantly being run over a pg_toast table in the database. Interestingly the underlying table did not have that amount of writes that would trigger the auto vacuum every minute.
>
> When I tried to run the VACUUM manually it died with a non-existent MultiXactId error:
>
> db=# vacuum analyze verbose pg_toast.pg_toast_376621;
> INFO: vacuuming "pg_toast.pg_toast_376621"
> ERROR: MultiXactId 2915905228 does no longer exist -- apparent wraparound
>
> db=# select txid_current();
> txid_current
> --------------
> 2583853583
> (1 row)
>
> db=# select datfrozenxid from pg_database where datname=‘db';
> datfrozenxid
> --------------
> 2161848861
> (1 row)
>
>
> We did a pg_upgrade about a month ago to upgrade from 9.2 to 9.4, so I guess there might be some relation, as I do not remember having that issue before the upgrade.
>
> The question is, how can I clean the table in question and finally be able to run a vacuum over it?

Hmm, I don't think there can be multixacts in toast tables at all,
normally. SELECT FOR UPDATE fails on a toast table, so I don't see a
mechanism for this to happen at all. Maybe some odd corner case in 9.2
that left things in a strange state.

The state of your data is probably caused by some weird corner case of
the upgrade. Can you see in the log files that the toast table has been
failing vacuum since the upgrade, or is it more recent than that? (In
other words, is there any working vacuum after the upgrade?)

It would be useful to debug this that you attached gdb to a backend, set
breakpoint on errfinish, then run vacuum on that table. Then you can
extract the page number from the backtrace. With the page number we can
try pageinspect and heap_page_items until we find the culprit and
perhaps identify how it got in that state.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Tom Lane 2015-11-27 19:33:48 Re: VACUUM unable to accomplish because of a non-existent MultiXactId
Previous Message Kouber Saparev 2015-11-27 18:25:49 VACUUM unable to accomplish because of a non-existent MultiXactId