Re: checkpoint write errors ( getting worse )

From: CS DBA <cs_dba(at)consistentstate(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: checkpoint write errors ( getting worse )
Date: 2016-10-22 21:35:25
Message-ID: a1b82bc1-027f-4b04-90ac-b571b33590ec@consistentstate.com
Lists: pgsql-general

So I ran REINDEX on all the databases and the errors went away for a while.
Now I'm seeing log entries like this:

FATAL: could not read block 0 of relation base/1311892067/2687: read only 0 of 8192 bytes

So I checked which db it is:

$ psql -h localhost
psql (8.4.20)
Type "help" for help.

postgres=# select datname from pg_database where oid = 1311892067;
  datname
------------
 access_one
(1 row)

But when I attempt to connect to the db so I can query for the table in
pg_class I get this:

postgres=# \c access_one
FATAL: could not read block 0 of relation base/1311892067/2687: read
only 0 of 8192 bytes

Thoughts?

On 10/22/2016 07:52 AM, CS DBA wrote:
> Thanks, the REINDEX fixed it. It's a client of ours and we're pushing
> to get them to move to 9.5.
>
> On 10/21/2016 06:33 PM, Tom Lane wrote:
>> CS DBA <cs_dba(at)consistentstate(dot)com> writes:
>>> we're seeing the below errors over and over in the logs of one of our
>>> postgres databases. Version 8.4.22
>> [ you really oughta get off 8.4, but you knew that right? ]
>>
>>> Anyone have any thoughts on correcting/debugging it?
>>> ERROR: xlog flush request 2571/9C141530 is not satisfied --- flushed
>>> only to 2570/DE61C290
>>> CONTEXT: writing block 4874 of relation base/1029860192/1029863651
>>> WARNING: could not write block 4874 of base/1029860192/1029863651
>>> DETAIL: Multiple failures --- write error might be permanent.
>> Evidently the LSN in this block is wrong. If it's an index, your idea of
>> REINDEX is probably the best solution. If it's a heap block, you could
>> probably make the problem go away by performing an update that changes any
>> tuple in this block. It doesn't even need to be a committed update; that
>> is, you could update or delete any row in that block, then roll back the
>> transaction, and it'd still be fixed.
>>
>> Try to avoid shutting down the DB until you've fixed the problem,
>> else you're looking at replay from whenever the last successful
>> checkpoint was :-(
>>
>>> Maybe I need to run a REINDEX on whatever table equates to
>>> "base/1029860192/1029863651"? If so how do I determine the db and table
>>> for "base/1029860192/1029863651"?
>> 1029860192 is the OID of the database's pg_database row.
>> 1029863651 is the relfilenode in the relation's pg_class row.
>>
>> regards, tom lane
>
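
For reference, the lookup described in the quoted reply (first number is the pg_database OID, second is the relfilenode in pg_class) can be sketched as a small helper. This is only an illustration, not anything that ships with PostgreSQL: the function names parse_relation_path and lookup_queries are made up here, and it assumes the path has exactly the base/<database OID>/<relfilenode> form seen in the log lines above.

```python
import re

# Matches the path form shown in the log lines: base/<database OID>/<relfilenode>
_REL_PATH = re.compile(r"^base/(\d+)/(\d+)$")


def parse_relation_path(path):
    """Split 'base/<database OID>/<relfilenode>' into two integers.

    Hypothetical helper for illustration only.
    """
    m = _REL_PATH.match(path.strip())
    if m is None:
        raise ValueError("not a base/<dboid>/<relfilenode> path: %r" % path)
    return int(m.group(1)), int(m.group(2))


def lookup_queries(path):
    """Return the two catalog queries that identify the database and relation."""
    db_oid, relfilenode = parse_relation_path(path)
    # Can be run from any database in the cluster, e.g. 'postgres'.
    db_query = "SELECT datname FROM pg_database WHERE oid = %d;" % db_oid
    # Must be run while connected to the database found by the first query.
    rel_query = (
        "SELECT relname, relkind FROM pg_class WHERE relfilenode = %d;"
        % relfilenode
    )
    return db_query, rel_query


if __name__ == "__main__":
    for query in lookup_queries("base/1029860192/1029863651"):
        print(query)
```

The second query only works from inside the affected database, since pg_class is per-database. In its result, relkind 'r' is an ordinary table and 'i' is an index.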
