Re: why pg_class.relfrozenxid needs to be updated for frozen tables (where all rows have xmin=2)?

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Slava Mudry <slava44(at)gmail(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: why pg_class.relfrozenxid needs to be updated for frozen tables (where all rows have xmin=2)?
Date: 2015-01-31 01:28:26
Message-ID: 54CC2FBA.1040506@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 1/30/15 5:44 PM, Slava Mudry wrote:
> Hi,
> I have to deal with badly written system which regularly suffers from
> transaction wraparound issue. This issue is happenning every 10-14 days
> and forces me to take system offline and vacuum in single-user mode.
> Main causes for this are (afaik):
> 1) heavy transaction traffic + 100+GB of stale tables
> 2) slow i/o (rotational drives)
> 3) autovacuum can't keep up.
>
> Basically the database stores events data in daily partitioned table
> "daily_events".
> What I did, was - I ran vaccum freeze on all partitions (the tables are
> never touched after they're done for a day). I have also scheduled
> vacuum-freeze for a partition after it's done writing.
>
> This essentially set xmin in each partition to "frozen" value of "2".
> However, to my surprise, this was not enough!
> Postgres stores relfrozenxid in pg_class and this value apparently is
> getting old pretty fast (due to high volume of transactions).
> And it seems that it doesn't really matter that xmin is frozen for a
> table, the relfrozenxid is what causing transaction wraparound.

relfrozenxid is only part of the picture. A database-wide freeze vacuum
will be controlled by pg_database.datfrozenxid.

What version is this? You may also be suffering from multixact wrap.

> Why is that? and most importantly - why updating pg_class.relfrozenxid
> requires huge amount of i/o by vacuum process for tables that are never
> updated?

Because it has to scan the entire table to see what the oldest XID is.
We don't check to see if relfrozenxid is already 2, though I suppose we
could add that.

> Is it safe to just update pg_class.relfrozenxid for tables where xmin=2
> for all rows? Same for linked toast table?

That would be a great way to lose data...

You need to look at relations where relfrozenxid is >= 3 and see why
relfrozenxid isn't advancing fast enough on them. Check your cost delay
settings as well as the *freeze* settings. It's very likely that on a
system this busy autovac would never keep up with default settings.

Also, keep in mind that transaction and multixact IDs are cluster-wide,
so this is going to affect all databases in that instance. You should
think about ways to move the heaviest transaction workload to a separate
cluster; possibly putting the raw updates there and having a separate
process that aggregates that data into fewer transactions for the main
cluster.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jim Nasby 2015-01-31 01:34:40 Re: working around JSONB's lack of stats?
Previous Message Slava Mudry 2015-01-30 23:44:00 why pg_class.relfrozenxid needs to be updated for frozen tables (where all rows have xmin=2)?