Re: Template zero xid issue

From: "Keaton Adams" <kadams(at)mxlogic(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Template zero xid issue
Date: 2007-08-06 18:20:45
Message-ID: 0B34A6972BF39E4CB465A64DBBAD2BB902261DCB@mxlhq-exch01.corp.mxlogic.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


We needed the DB up right away; one of the operations guys connected
through a standalone backend to vacuum template0 and now the DB is up
again:

"sudo -u postgres /mxl/var/pgsql/bin/postgres -D /mxl/var/pgsql/data -O
-P template0

Once I got in, I just ran a VACUUM VERBOSE ANALYZE; to "reset"
the XID counter."

So just to be clear on this statement:

"You misunderstand what you are looking at --- that's the cluster-wide
XID consumption rate, not any one database's."

In the docs it says that "the value stored in the datfrozenxid field is
the freeze cutoff XID that was used by the last database-wide vacuum
operation. All normal XIDs older than this cutoff XID are guaranteed to
have been replaced by FrozenXID within that database." The query to
check is:

select datname, age(datfrozenxid) from pg_database;

"The age column measures the number of transactions from the cutoff XID
to the current transaction's XID. When the age approaches two billion,
the database must be vacuumed again to avoid risk of wraparound
failures."

So, if XID is database instance (cluster-wide), that would explain why
the actual increment of the age value is the same across all DB's, even
though the actual XID numbers are different per db?

bash-3.00$ date; psql -dpostgres -frun.sql
Mon Aug 6 11:20:03 MDT 2007
datname | age
--------------+------------
postgres | 1075005030
mxl | 1075004446
.
.
.
template1 | 1074931654
template0 | 1074740691
(7 rows)

bash-3.00$ date; psql -dpostgres -frun.sql
Mon Aug 6 11:23:53 MDT 2007
datname | age
--------------+------------
postgres | 1075046568
mxl | 1075045984
.
.
.
template1 | 1074973192
template0 | 1074782229
(7 rows)

Postgres increment: 41,538
Mxl increment: 41,538
Template 1 increment: 41,538
Template 0 increment: 41,538

But, the actual reset of the FrozenXID needs to take place at the
individual DB level, and that's where the full-db vacuum comes into
play, correct?

And the bug in 8.1.4 is allowing the FrozenXID on template0 to increment
when it shouldn't be doing so, and since the normal vacuum ignores a
read-only db such as template0, that is why we had to force a vacuum
after the DB shut down?

Do I have this right?

Thanks again,

Keaton

-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Monday, August 06, 2007 11:02 AM
To: Joshua D. Drake
Cc: Keaton Adams; pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] Template zero xid issue

"Joshua D. Drake" <jd(at)commandprompt(dot)com> writes:
> 2. IIRC there is a bug in 8.1.4 that causes template0 to increment
when
> it shouldn't. Update the template0 to allow connections, vacuum full,
> dissallow connections, and then upgrade.

That should be "vacuum freeze" but otherwise this is the correct
recovery process. However, before destroying the evidence I'd like to
find out how he got into this state ...

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ed L. 2007-08-06 18:23:07 8.1.2 select for update issue
Previous Message Jeff Davis 2007-08-06 18:19:41 Re: new line in psotgres