Re: Need help with error

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Steven Saner <ssaner(at)pantheranet(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Need help with error
Date: 2000-07-05 20:29:16
Message-ID: 4185.962828956@sss.pgh.pa.us
Lists: pgsql-general

Steven Saner <ssaner(at)pantheranet(dot)com> writes:
> Using Postgres 7.0 on BSDI 4.1
> For the last several days we are getting errors that look like this:

> Error: cannot write block 0 of krftmp4 [adm] blind.

> An interesting thing is that in this example, krftmp4 is a table that
> the user that got this error message would not have accessed in any
> way.

Right --- that's implicit in the blind-write logic. A blind write
means trying to dump out a dirty page from the shared buffer pool
that belongs to a relation your own backend hasn't touched.

Since the write fails, the dirty block remains in the shared buffer
pool, waiting for some other backend to try to dump it again and fail
again :-(
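
As a rough illustration of that failure loop, here is a toy model in Python. It is a sketch only, not the actual backend code: the class, method, and relation names other than krftmp4 are hypothetical, and the "filesystem" is just a set of relation names whose files are assumed to still exist.

```python
class WriteError(Exception):
    """Raised when a dirty page cannot be written to its relation's file."""


class BufferPool:
    """Toy shared buffer pool (hypothetical model, not PostgreSQL's code)."""

    def __init__(self, existing_relations):
        # Relations whose on-disk files are assumed to still exist.
        self.existing = set(existing_relations)
        # Dirty pages waiting to be written, oldest first: (relation, block).
        self.dirty = []

    def mark_dirty(self, relation, block):
        self.dirty.append((relation, block))

    def flush_one(self, backend_relations):
        """Write out the oldest dirty page on behalf of one backend.

        The write is "blind" when the page belongs to a relation that the
        calling backend has not touched itself.
        """
        relation, block = self.dirty[0]
        blind = relation not in backend_relations
        if relation not in self.existing:
            # The write fails, so the page is NOT removed from the pool.
            suffix = " blind" if blind else ""
            raise WriteError(f"cannot write block {block} of {relation}{suffix}")
        self.dirty.pop(0)
        return relation, block


def demo():
    """Two unrelated backends each trip over the same orphaned dirty page."""
    pool = BufferPool(existing_relations={"orders"})
    pool.mark_dirty("krftmp4", 0)  # file for krftmp4 is assumed to be gone
    errors = []
    for _ in range(2):
        try:
            pool.flush_one(backend_relations={"orders"})
        except WriteError as exc:
            errors.append(str(exc))
    return pool, errors
```

In this model every call to flush_one fails the same way until the pool object itself is thrown away and rebuilt, because nothing ever removes the page from the dirty list.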

The simplest recovery method is to restart the postmaster, causing a new
buffer pool to be set up.

However, from a developer's perspective, I'm more interested in finding
out how you got into this state in the first place. We thought we'd
fixed all the bugs that could give rise to orphaned dirty blocks, which
was the cause of this type of error in all the cases we'd seen so far.
Perhaps there is still a remaining bug of that kind, or maybe you've
found a new way to cause this problem. Do you have time to do some
investigation before you restart the postmaster?

One thing I'd like to know is why the write is failing in the first
place. Did you delete or rename the krftmp4 table, or its containing
database adm, shortly before these errors started appearing?
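
One quick way to check is to see whether the table's file is still on disk. The sketch below assumes the 7.0-era layout, in which each table lives in a file named after it under base/<database>/ in the data directory; that layout is an assumption to verify on your installation, and the function name and the data directory path in the usage note are hypothetical.

```python
import os


def relation_file_exists(data_dir, database, relation):
    """Return True if the file presumed to back `relation` is present.

    Assumes (unverified for your installation) that the table is stored at
    <data_dir>/base/<database>/<relation>.
    """
    path = os.path.join(data_dir, "base", database, relation)
    return os.path.isfile(path)
```

For example, relation_file_exists("/path/to/pgdata", "adm", "krftmp4") returning False would suggest that the file was removed or renamed while a dirty page for it was still in the buffer pool.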

> When this happens, it seems that the backend dies, which
> ends up causing the backend connections for all users to die.

That shouldn't be happening either; blind write failure is classed as
a simple ERROR, not a FATAL error. Does any message appear in the
postmaster log? Is a corefile dumped, and if so what do you get from
a backtrace?

regards, tom lane
