Re: Finding out the aging tables for vacuuming to prevent data loss on transaction id wraparound

From: Subramanian Krishnan <subramanian(dot)kris(at)gmail(dot)com>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: Finding out the aging tables for vacuuming to prevent data loss on transaction id wraparound
Date: 2018-03-06 11:54:18
Message-ID: CAPgf7=i_s6oJcZm7ak1QTggZNqLqgsjt5AsUiLZ7CoSrkLD5aw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Hello Laurenz,

Thank you for taking the time out to read through and problem description
and responding back.

After posting the question on the forum, I continued investigating possible
root causes. Through a flash of intuition I suspected if it could be a
table which is strictly not a part of the database (not listed in pg_class)
which could be contributing factor.

pg_shdepend fitted the bill. So I did a vacuum on pg_catalog.pg_shdepend
and post that re-ran the DB age query. The trick worked and the age came
down to match the max age entry in pg_class. To be on the safer side, I did
vacuuming of pg_shdescription and pg_shseclabel as well.

Your suggestion of:
1) Making AV more aggressive and
2) Tiding over the present problem of aged TOAST tables though
manual/scripted VACUUM is valid and we are working on that currently.

Thanks and Regards,
Subu

On Mon, Mar 5, 2018 at 2:54 PM, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
wrote:

> Subramanian Krishnan wrote:
> > We are running PostgreSQL 9.4.14 and recently we received an alert for
> transaction id
> > crossing the 1 billion mark. Since we can hit the transaction id
> wraparound issue for aging tables,
> > we investigated by running the following queries:
> >
> > mpsdb=> select * from txid_current();
> > txid_current
> > --------------
> > 1100826671
> > (1 row)
> >
> > mpsdb=> SELECT datname, age(datfrozenxid) FROM pg_database ORDER BY 2
> DESC LIMIT 20;
> > datname | age
> > -----------+------------
> > mpsdb | 1087909867
> > rdsadmin | 1000562517
> > postgres | 950697143
> > template0 | 900709601
> > template1 | 900697643
> > (5 rows)
> >
> > mpsdb=> SELECT c.relname as table_name, c.relkind as type,
> age(c.relfrozenxid) as age, c.relfrozenxid FROM pg_class AS c WHERE
> age(c.relfrozenxid) <> 2147483647 ORDER BY 3 DESC LIMIT 20;
> > table_name | type | age | relfrozenxid
> > -------------------+------+------------+--------------
> > pg_toast_22985917 | t | 1042877986 | 57941433
> > pg_toast_22986301 | t | 1042877877 | 57941542
> > pg_toast_23823161 | t | 1041791836 | 59027583
> > pg_toast_23823242 | t | 1041791776 | 59027643
> > pg_toast_23904594 | t | 1041658236 | 59161183
> > pg_toast_23904629 | t | 1041658221 | 59161198
> > pg_toast_23904640 | t | 1041658208 | 59161211
> > pg_toast_23904648 | t | 1041658200 | 59161219
> > pg_toast_23904658 | t | 1041658191 | 59161228
> > pg_toast_23904666 | t | 1041658183 | 59161236
> > pg_toast_23904674 | t | 1041658175 | 59161244
> > pg_toast_23904682 | t | 1041658169 | 59161250
> > pg_toast_23904690 | t | 1041658161 | 59161258
> > pg_toast_23904698 | t | 1041658156 | 59161263
> > pg_toast_23904706 | t | 1041658149 | 59161270
> > pg_toast_23904714 | t | 1041658129 | 59161290
> > pg_toast_23904722 | t | 1041658096 | 59161323
> > pg_toast_23904730 | t | 1041658090 | 59161329
> > pg_toast_23904738 | t | 1041658085 | 59161334
> > pg_toast_23904746 | t | 1041658077 | 59161342
> > (20 rows)
> >
> >
> > mpsdb=> SELECT datname, usename, pid, waiting, current_timestamp -
> xact_start AS xact_runtime, query FROM pg_stat_activity WHERE upper(query)
> like '%VACUUM%' ORDER BY xact_start;
> > datname | usename | pid | waiting | xact_runtime |
> query
> > ---------+----------+-------+---------+-----------------+---
> ---------------------------------------------------------------------
> > mpsdb | rdsadmin | 13833 | f | 00:00:14.161262 | autovacuum:
> VACUUM pg_toast.pg_toast_242649675 (to prevent wraparound)
> > mpsdb | rdsadmin | 14565 | f | 00:00:11.735571 | autovacuum:
> VACUUM pg_toast.pg_toast_242649757 (to prevent wraparound)
> > mpsdb | rdsadmin | 15315 | f | 00:00:09.303762 | autovacuum:
> VACUUM pg_toast.pg_toast_242649819 (to prevent wraparound)
> > mpsdb | rdsadmin | 14935 | f | 00:00:06.893078 | autovacuum:
> VACUUM pg_toast.pg_toast_243226657 (to prevent wraparound)
> > mpsdb | rdsadmin | 15851 | f | 00:00:04.322474 | autovacuum:
> VACUUM pg_toast.pg_toast_243227582 (to prevent wraparound)
> > mpsdb | rdsadmin | 15615 | f | 00:00:01.768495 | autovacuum:
> VACUUM pg_toast.pg_toast_243226332 (to prevent wraparound)
> >
> > It is clear that we need to vacuum the most aged pg_toast tables which
> are not yet picked up by the autovacuum process.
> >
> > The question which is bothering us though is:
> > As per the queries we ran the most aged transaction and table is
> 1042877986 (pg_toast_22985917).
> > And as per PostgreSQL pg_database documentation the most aged
> table/transaction in pg_class determines
> > age of the database (via the datfrozenxid value). If that is the case we
> would have expected the age
> > of mpsdb database to be 1042877986 (or in this range) but what we see as
> age is 1087909867.
> >
> > Does this mean there are other aging tables we are not aware of? If yes,
> how do we get to those
> > tables since we have already queried pg_class for the most aged table?
> >
> > Request any help/guidance we can get regarding this.
>
> That is indeed strange.
>
> Reading the code, I see that at the end of a VACUUM, "datfrozenxid" is set
> to the minimum value of all "relfrozenxid" for all tables in the database
> obtained with a sequential scan of "pg_class".
>
> The only exception is that if a table is found where "relfrozenxid" is in
> the
> future, nothing is done. The comment suggests that that should normally
> not
> happen, but "has been known to arise due to bugs in pg_upgrade".
>
> - Can you find tables with negative "relfrozenxid" age in "pg_class"?
>
> - The other option is that a VACUUM finished between the query of
> "pg_database"
> and the query of "pg_class". Is that an option?
>
> You should tune autovacuum to be more aggressive so that it gets done
> processing
> the tables. To get rid of the immediate problem, you could schedule a
> manual
> VACUUM of the tables to which the TOAST table belong.
>
> Yours,
> Laurenz Albe
>

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message David Steele 2018-03-06 13:13:41 Re: future of pgaudit
Previous Message Achilleas Mantzios 2018-03-06 11:00:41 future of pgaudit