Could not open file "pg_subtrans/01EB"

From: Mariel Cherkassky <mariel(dot)cherkassky(at)gmail(dot)com>
To: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Could not open file "pg_subtrans/01EB"
Date: 2018-08-26 09:29:32
Message-ID: CA+t6e1=fYtSwS=X4mZ=gS-HP6N9Sc4gacZB4uR_+05wAwrzreA@mail.gmail.com
Lists: pgsql-admin

Hi,
I'm investigating a database that belongs to one of our clients. The
database version is 9.2.5. The client tried to dump one of its databases
and got the following error:

pg_dump: query returned 2 rows instead of one: SELECT tableoid, oid,
(SELECT rolname FROM pg_catalog.pg_roles WHERE oid = datdba) AS dba,
pg_encoding_to_char(encoding) AS encoding, datcollate, datctype,
datfrozenxid, (SELECT spcname FROM pg_tablespace t WHERE t.oid =
dattablespace) AS tablespace, shobj_description(oid, 'pg_database') AS
description FROM pg_database WHERE datname = 'db1'

So I queried pg_database and saw that it contains duplicate rows:
postgres=# select xmin,xmax,datname,datfrozenxid from pg_database order by datname;
 xmin  |   xmax   |  datname  | datfrozenxid
-------+----------+-----------+--------------
  2351 |        0 | db1       |         1798
  1809 | 21093518 | db1       |         1798
  1806 |        0 | postgres  |         1798
 12594 |        0 | db2       |         1798
  1803 |        0 | template0 |         1798
  1802 |        0 | template1 |         1798
  3590 |        0 | db4       |         1798
  3592 |        0 | db3       |         1798
  1811 | 21077312 | db3       |         1798
(9 rows)

fsync and full_page_writes are both set to on.

I changed the database names, but as you can see db1 and db3 have duplicate
rows. I tried to dump the postgres database and that worked. I then ran
vacuum on the problematic databases: I connected to db1/2/3 and ran the
vacuum command. For many of the objects I got the following detail message:

DETAIL: x dead row versions cannot be removed yet.

I'm the only one working on the database and there are no other sessions in
pg_stat_activity. So why can't some of the row versions be removed?

I tried to reindex the problematic databases but got the following error:
reindexdb: reindexing of database "db1" failed: ERROR: could not access
status of transaction 32212695
DETAIL: Could not open file "pg_subtrans/01EB": No such file or directory.
I checked and indeed that file doesn't exist.
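
For reference, the file name in that message can be derived from the transaction ID. The following is only a sketch, assuming a default build (8192-byte blocks, one 4-byte parent XID per transaction, 32 pages per segment file, so 65536 transaction IDs per pg_subtrans segment). The function name subtrans_segment is made up for illustration and is not part of PostgreSQL.

```shell
# Sketch: map a transaction ID to the name of its pg_subtrans segment file.
# Assumptions (default build constants, not taken from the post):
#   8192-byte blocks / 4 bytes per entry = 2048 XIDs per page
#   32 pages per segment file            = 65536 XIDs per segment
subtrans_segment() {
  xid=$1
  xids_per_segment=$(( (8192 / 4) * 32 ))
  # Segment files are named with the segment number as 4 hex digits.
  printf '%04X\n' $(( xid / xids_per_segment ))
}

subtrans_segment 32212695   # prints 01EB
```

Under those assumptions, transaction 32212695 falls in segment 01EB, which matches the file named in the error.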

I restarted the cluster and got the same error in the log file for every
database (in all cases the automatic analyze of "pg_catalog.pg_shdepend"
failed and caused the error).

2018-05-06 23:46:54 +08 30185 DETAIL: Could not open file
"pg_subtrans/01EB": No such file or directory.
2018-05-06 23:46:54 +08 30185 CONTEXT: automatic analyze of table
"afa.pg_catalog.pg_shdepend"
2018-05-06 23:47:06 +08 30213 ERROR: could not access status of
transaction 32635595

I created a new zero-filled subtrans file named 01EB and restarted the
cluster:
dd if=/dev/zero of=/var/lib/pgsql/data/pg_subtrans/01EB bs=256k count=1
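
As a sanity check on the size used in that dd command, here is a small sketch that computes the expected size of one segment file. It assumes a default build (8192-byte blocks, 32 pages per segment file); the function name subtrans_segment_bytes is hypothetical.

```shell
# Sketch: expected size in bytes of one pg_subtrans segment file.
# Assumption: default build constants (block size 8192, 32 pages per segment).
# An optional first argument overrides the block size for non-default builds.
subtrans_segment_bytes() {
  block_size=${1:-8192}
  echo $(( block_size * 32 ))
}

subtrans_segment_bytes   # prints 262144, i.e. 256k, matching bs=256k count=1
```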

After that I didn't get any errors in the database log.

Afterwards, I still had duplicate rows in pg_database. I tried again to
reindex the problematic databases:

[root(at)my_host pg_subtrans]# reindexdb db1 -U postgres
Password:
NOTICE: table "pg_catalog.pg_class" was reindexed

and it has been stuck at that point and doesn't advance to the other tables.
In pg_stat_activity I don't see state_change changing.

Any idea how I can proceed from here?

Thanks.
