Re: Could not open file "pg_subtrans/01EB"

From: Mariel Cherkassky <mariel(dot)cherkassky(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: Could not open file "pg_subtrans/01EB"
Date: 2018-08-28 11:33:08
Message-ID: CA+t6e1=NEhMCrZS9bcrhjAbuO_SVyAu2n98zSuY7xrNoZKitRA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Any idea what should I do ?

‫בתאריך יום ב׳, 27 באוג׳ 2018 ב-15:05 מאת ‪Mariel Cherkassky‬‏ <‪
mariel(dot)cherkassky(at)gmail(dot)com‬‏>:‬

> So As Tom Lane suggested I upgraded the database to the 9.2.24 version. I
> vacuumed (vacuum verbose) all the databases and for one of the databases
> (that has a duplicated record in pg_database) I got the next meesages for
> two of the obejcts :
> 2018-08-27 16:57:59 +08 db4 22026 WARNING: relation "a" page 1560 is
> uninitialized --- fixing
> 2018-08-27 16:58:00 +08 db4 22026 WARNING: relation "a" page 1561 is
> uninitialized --- fixing
> 2018-08-27 16:58:00 +08 db4 22026 WARNING: relation "a" page 1562 is
> uninitialized --- fixing
> 2018-08-27 16:58:00 +08 db4 22026 WARNING: relation "a" page 1563 is
> uninitialized --- fixing
> 2018-08-27 16:58:00 +08 db4 22026 WARNING: relation "a" page 1564 is
> uninitialized --- fixing
> 2018-08-27 16:58:00 +08 db4 22026 WARNING: relation "a" page 1565 is
> uninitialized --- fixing
> 2018-08-27 16:58:00 +08 db4 22026 WARNING: relation "a" page 1566 is
> uninitialized --- fixing
> 2018-08-27 16:58:00 +08 db4 22026 WARNING: relation "a" page 1567 is
> uninitialized --- fixing
> 2018-08-27 16:58:00 +08 db4 22026 WARNING: relation "a" page 1568 is
> uninitialized --- fixing
> 2018-08-27 16:58:00 +08 db4 22026 WARNING: relation "a" page 1569 is
> uninitialized --- fixing
> 2018-08-27 16:58:23 +08 db4 22026 WARNING: relation "b" page 1031 is
> uninitialized --- fixing
> 2018-08-27 16:58:23 +08 db4 22026 WARNING: relation "b" page 1032 is
> uninitialized --- fixing
> 2018-08-27 16:58:23 +08 db4 22026 WARNING: relation "b" page 1033 is
> uninitialized --- fixing
> 2018-08-27 16:58:23 +08 db4 22026 WARNING: relation "b" page 1034 is
> uninitialized --- fixing
> 2018-08-27 16:58:23 +08 db4 22026 WARNING: relation "b" page 1035 is
> uninitialized --- fixing
> 2018-08-27 16:58:23 +08 db4 22026 WARNING: relation "b" page 1036 is
> uninitialized --- fixing
> 2018-08-27 16:58:23 +08 db4 22026 WARNING: relation "b" page 1037 is
> uninitialized --- fixing
>
> I tried re indexing the problematic databases :
> db1>reindex system db1
> 2018-08-27 17:09:44 +08 db1 23218 ERROR: could not access status of
> transaction 32212695
> 2018-08-27 17:09:44 +08 db1 23218 DETAIL: Could not open file
> "pg_subtrans/01EB": No such file or directory.
>
> I tried to delete the duplicated rows in pg_database but the delete doesnt
> delete anything :
>
> select ctid,xmin,xmax,datname from pg_database order by datname;
> ctid | xmin | xmax | datname
> --------+-------+----------+----------------
> (0,21) | 2351 | 0 | db1
> (0,4) | 1809 | 21093518 | db1
> (0,3) | 1806 | 0 | postgres
> (0,24) | 12594 | 0 | db2
> (0,2) | 1803 | 0 | template0
> (0,1) | 1802 | 0 | template1
> (0,22) | 3590 | 0 | db3
> (0,23) | 3592 | 0 | db4
> (0,5) | 1811 | 21077312 | db4
> (9 rows)
>
> As you can see db1 and db4 have duplicated records. I tried to delete them
> :
> delete from pg_database where ctid='(0,4)';
> DELETE 0
>
> but the record does exist :
> select ctid,datname from pg_database where ctid='(0,4)';
> ctid | datname
> -------+---------
> (0,4) | db1
> (1 row)
>
> I set the zero_damaged_pages to on but it didnt help either.
>
> How can I continue ?
>
>
> ‫בתאריך יום א׳, 26 באוג׳ 2018 ב-19:42 מאת ‪Mariel Cherkassky‬‏ <‪
> mariel(dot)cherkassky(at)gmail(dot)com‬‏>:‬
>
>> 1.I'm not really sure if they have one but its not my business.. I'm just
>> trying to help those guys with an application that is based on postgres...
>> 2.Yeah I realized that it is an option, but do you really think that it
>> should be the first solution ? I tried to search for bugs that seems
>> identical to my case but I didnt find any.
>> 3.I set the vacuum_freeze_table_age to 0 and vacuumed all the duplicated
>> databases but it seems that it didn't solve their problem.
>> 4.I tried to delete the records according to the value in xmin/xmax and
>> the result was that after the delete postgresql didnt recognize that I have
>> those databases. I queried the ctid and the xmin/xman :
>> postgres=# select xmin,xmax,datname,ctid from pg_Database;
>> xmin | xmax | datname | ctid
>> -------+----------+----------------+--------
>> 1802 | 0 | template1 | (0,1)
>> 1803 | 0 | template0 | (0,2)
>> 1806 | 0 | postgres | (0,3)
>> 1809 | 21093518 | db1 | (0,4)
>> 1811 | 21077312 | db2 | (0,5)
>> 2351 | 0 | db1 | (0,21)
>> 3590 | 0 | db3 | (0,22)
>> 3592 | 0 | db2 | (0,23)
>> 12594 | 0 | db4 | (0,24)
>>
>> I tried to delete db1 with xmax 0 and db2 with xmax 2 (the opposite of
>> what you suggested with the ctid). Now, I thought that if the xmax is set
>> to 0 it means that I "didnt have" any update / delete operations and
>> therefore I should delete those databases. Does it matters which one to
>> delete ? I read about the xmin/xmax/ctid columns but I thought that what I
>> did was legit.
>>
>> Moreover I realized that I have duplicated rows in more system tables so
>> i don`t really like this solution.
>>
>> My next attempt will be to upgrade to 9.2.24.
>>
>> Thanks Tom ! ,
>>
>> Regards Mariel.
>>
>> ‫בתאריך יום א׳, 26 באוג׳ 2018 ב-18:51 מאת ‪Tom Lane‬‏ <‪tgl(at)sss(dot)pgh(dot)pa(dot)us
>> ‬‏>:‬
>>
>>> Mariel Cherkassky <mariel(dot)cherkassky(at)gmail(dot)com> writes:
>>> > I'm trying to investigate a database of one of our clients. The
>>> database
>>> > version is 9.2.5.
>>>
>>> 1. Fire their DBA for dereliction of duty.
>>>
>>> 2. Update to the last available release of 9.2.x (9.2.24, looks like).
>>>
>>> 3. Vacuum everything and see if it gets better.
>>>
>>> Vacuuming may or may not fix the observed data corruption problems, but
>>> it's silly to ignore the fact that they're missing four years worth
>>> of bug fixes in that branch. In particular I'm noticing the first
>>> entry in the change notes for 9.2.6, which recommends "vacuuming all
>>> tables in all databases while having vacuum_freeze_table_age set to
>>> zero":
>>> https://www.postgresql.org/docs/9.2/static/release-9-2-6.html
>>> That problem as-described seems to involve rows disappearing, rather than
>>> getting duplicated, but I wouldn't rule out that it could also have
>>> symptoms like this.
>>>
>>> If that doesn't fix things, you could then try:
>>>
>>> 4. Manually eliminate duplicate rows by identifying which one of each
>>> pair
>>> seems older and deleting it with DELETE ... WHERE ctid = '(x,y)'. Then
>>> reindex to confirm no duplicates remain.
>>>
>>> But you still need step 2, to reduce the odds the problem will recur.
>>>
>>> regards, tom lane
>>>
>>

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Scott Ribe 2018-08-28 12:11:52 Re: How to get alerted automatically whenever a table structure is changed between Publisher and Subscriber in Logical Replication?
Previous Message wambacher 2018-08-28 11:26:34 Re: tuple concurrently updated