Re: four template0 databases after vacuum

From: Melvin Davidson <melvin6925(at)gmail(dot)com>
To: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: four template0 databases after vacuum
Date: 2016-02-07 15:14:06
Message-ID: CANu8Fiwfb_SmE=QfEgVoCK9NEy94Fo0N9bML=nj8M14xhviyGw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sun, Feb 7, 2016 at 4:51 AM, Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
wrote:

> Just a shot in the dark for a possible lead to follow down (sorry for
> top-posting):
>
> Is there index corruption on system tables ?
>
> (like, several index entries pointing to the one template0 row)
>
> Karsten
>
>
> *Gesendet:* Sonntag, 07. Februar 2016 um 03:43 Uhr
> *Von:* "Kazuaki Fujikura" <fujya(at)fujya(dot)com>
> *An:* pgsql-general(at)postgresql(dot)org
> *Betreff:* [GENERAL] four template0 databases after vacuum
> Hi there,
>
> Version: 9.1.6 running since Dec, 2014
> We have 3 different databases.
>
> [problem history/background]
>
> Jan 10th, 2016:
> The first problem was autovacuum issue.
> - autovacuum could not finish successfully.
> - I set autovacuum_freeze_max_age to 2 hundreds million.
> - autovacuum immediately finished against the database which age was over
> 2 hundreds million.
> - so, autovacuum did not go next database
> - I then run "vacuumdb -az" and run vacuum freeze analyze against
> template0 after setting datallowconn to true
> - I set datallowconn to false
>
> ============================================================
> $ vacuumdb -az
> $ psql template1
> template1=# UPDATE pg_database SET datallowconn = TRUE where datname =
> 'template0';
> UPDATE 1
> template1=# \c template0
> template0=# VACUUM FREEZE ANALYZE ;
> VACUUM
> template0=# \c template1
> template1=# UPDATE pg_database SET datallowconn = FALSE where datname =
> 'template0';
> UPDATE 1
> ============================================================
>
> - At this point, there is one template0 only
>
>
> Jan 30th, 2016:
> After three weeks, I again hit the same issue - autovacuum could not
> finish.
> This time, the age did not reduce with manual vacuum.
> I then run vacuum full to pg_database. The age of pg_database becomes
> minus value.
> Then, autovacuum started again.
>
> ============================================================
> target_db=# SELECT relname, age(relfrozenxid) FROM pg_class WHERE relkind
> = 'r' ;
> relname | age
> ------------------------------------------------+-----------
> pg_database | 219383067
> target_db=# VACUUM FREEZE ;
> VACUUM
> target_db=# SELECT relname, age(relfrozenxid) FROM pg_class WHERE relkind
> = 'r' ;
> relname | age
> ------------------------------------------------+-----------
> pg_database | 219387307
> target_db=# VACUUM FULL ;
> VACUUM
> relname | age
> ------------------------------------------------+-----------
> pg_database | -1861408089
> ============================================================
>
>
>
> Yesterday:
>
> I run the following command to run vacuum full to all pg_database.
> Then, I run vacuum freeze analyze and vacuum full to template0 after
> setting datallowconn.
>
> ============================================================
> $ psql -lt | awk '{print $1}' | grep -v ^$ | grep -v ^\| | while read
> line; do psql ${line} -c "VACUUM FULL pg_database;"; done
> VACUUM ....
>
> $ psql template1
> template1=# UPDATE pg_database SET datallowconn = TRUE where datname =
> 'template0';
> UPDATE 1
> template1=# \c template0
> template0=# VACUUM FULL ;
> VACUUM
> template0=# \c template1
> template1=# UPDATE pg_database SET datallowconn = FALSE where datname =
> 'template0';
> UPDATE 4
> ============================================================
>
> ***Then I HAVE FOUR template0 DATABASES***
>
>
>
> [Current problems]
>
> We now have three issues in our production.
>
> 1. It looks four template0 databases exist
> 2. Xid of template0 keeps growing
> 3. Can not freeze xid of template0
>
>
>
> 1. It looks four template0 databases exist
>
> ============================================================
> $ psql -l | grep template0
> template0 | postgres | UTF8
> | C | C | =c/postgres +
> template0 | postgres | UTF8
> | C | C | =c/postgres +
> template0 | postgres | UTF8
> | C | C | =c/postgres +
> template0 | postgres | UTF8
> | C | C | =c/postgres +
> ============================================================
>
> These have same dataid.
>
> ============================================================
> postgres=# SELECT datid, datname FROM pg_stat_database where datname =
> 'template0';
> datid | datname
> -------+-----------
> 12772 | template0
> 12772 | template0
> 12772 | template0
> 12772 | template0
> (4 rows)
> ============================================================
>
>
>
> 2. Xid of template0 keeps growing
>
> ============================================================
> postgres=# SELECT datname, age(datfrozenxid) FROM pg_database order by age
> desc;
> datname | age
> ------------------------------------------------+-----------
> template0 | 198431852
> template0 | 198431852
> template0 | 198431852
> template0 | 50480024
> template1 | 45629585
> ============================================================
>
> At this moment, the maximum age value of all databases is template0.
> The age value keeps growing.
>
> One of 4 template0 is young (504080024). Other three template0s are still
> old.
>
>
> 3. Can not freeze xid of template0
>
> To reset xid of template0, I did vacuum full/ vacuum freeze to template0.
> But,
> the age of three template0 did not change. Only of of 4 template0 had
> successfully
> changed the age young.
>
> ============================================================
> $ psql template1
> template1=# UPDATE pg_database SET datallowconn = TRUE where datname =
> 'template0';
> UPDATE 4
> template1=# \c template0
> template0=# VACUUM FREEZE ANALYZE ;
> VACUUM
> template0=# VACUUM FULL ;
> VACUUM
> template0=# \c template1
> template1=# UPDATE pg_database SET datallowconn = FALSE where datname =
> 'template0';
> UPDATE 4
> ============================================================
>
> I run the commands above. But, I could not change the age of three
> template0 databases.
>
>
>
> [My idea to fix this]
>
> If I don't do anything about this, I think our production service will be
> down because it exceeds the limit of xid.
>
> I guess if I drop all template0 and create template0 again, then
> everything gets back normal.
> But I am not quite sure if my approach is right.
>
> I would appreciate any suggestion/comments.
>
> Best regards,
> Kazuaki Fujikura
>
With regards to Karsten's thought, here is a query to find any pg_catalog
indexes that are corrupt.

SELECT n.nspname as schema,
i.relname as table,
i.indexrelname as index,
i.idx_scan,
i.idx_tup_read,
i.idx_tup_fetch,
CASE WHEN idx.indisprimary
THEN 'pkey'
WHEN idx.indisunique
THEN 'uidx'
ELSE 'idx'
END AS type,
pg_get_indexdef(idx.indexrelid),
CASE WHEN idx.indisvalid
THEN 'valid'
ELSE 'INVALID'
END as statusi,
pg_relation_size(quote_ident(n.nspname)|| '.' ||
quote_ident(i.relname)) as size_in_bytes,
pg_size_pretty(pg_relation_size(quote_ident(n.nspname)|| '.' ||
quote_ident(i.relname))) as size
FROM pg_stat_all_indexes i
JOIN pg_class c ON (c.oid = i.relid)
JOIN pg_namespace n ON (n.oid = c.relnamespace)
JOIN pg_index idx ON (idx.indexrelid = i.indexrelid )
WHERE n.nspname = 'pg_catalog'
AND NOT idx.indisvalid
ORDER BY 1, 2, 3;

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2016-02-07 17:31:54 Re: four template0 databases after vacuum
Previous Message FarjadFarid(ChkNet) 2016-02-07 13:39:57 Re: Asp.net 5 and EF6