Re: IO Timeout

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alex Turner <armtuk(at)gmail(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org, Alex Turner <aturner(at)neteconomist(dot)com>
Subject: Re: IO Timeout
Date: 2005-03-11 04:09:07
Message-ID: 28300.1110514147@sss.pgh.pa.us
Lists: pgsql-admin pgsql-general

Alex Turner <armtuk(at)gmail(dot)com> writes:
> Well - I am sort of trying to piece together exactly what happened.
> Here's what I know.

> Around 02:52 I get messages in my syslog stating that there were
> problems writing to a controller channel:
> [ various hardware errors snipped ]

> At around 07:30 all connections were failing giving the error:
> InternalError: FATAL: the database system is in recovery mode

I think what happened here is that Postgres got a write error on WAL,
which would probably cause a PANIC, and then the ensuing database reboot
got hung up trying to re-read WAL. Client connection requests would be
refused with messages like the above until the recovery process
completed. The fact that this was still going on 4+ hours later shows
that Postgres is *not* timing out on stuck disk operations ... very much
the reverse in fact.
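
On the client side, the refusals described above can at least be bounded. What follows is only a minimal sketch, assuming the client sees the error text quoted above somewhere in the exception message. The names `connect_with_deadline` and `is_recovery_error`, and the zero-argument `connect` callable, are hypothetical and not part of any PostgreSQL driver. It retries while the server reports recovery mode and gives up after a deadline, so that an application does not wait indefinitely on a recovery that is itself stuck on disk I/O.

```python
import time

# Error text taken from the report quoted above; matching on the message
# string is an assumption of this sketch, not a documented driver interface.
RECOVERY_MARKER = "the database system is in recovery mode"


def is_recovery_error(exc):
    """Return True if the exception text looks like a recovery-mode refusal."""
    return RECOVERY_MARKER in str(exc)


def connect_with_deadline(connect, deadline_s=60.0, delay_s=1.0,
                          sleep=time.sleep, clock=time.monotonic):
    """Call connect() until it succeeds or deadline_s has elapsed.

    connect is a hypothetical zero-argument callable that returns a
    connection or raises. Only recovery-mode refusals are retried; any
    other error is re-raised immediately.
    """
    start = clock()
    attempts = 0
    while True:
        attempts += 1
        try:
            return connect()
        except Exception as exc:
            if not is_recovery_error(exc):
                raise
            # Give up if waiting one more delay would pass the deadline.
            if clock() - start + delay_s > deadline_s:
                raise TimeoutError(
                    f"server still in recovery after {attempts} attempts"
                ) from exc
            sleep(delay_s)
```

This does nothing about the stuck disk operation itself; it only keeps the client from waiting without limit while the server is unavailable.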

You'd be best off to take the matter up with some kernel hackers.
If there's anything to be done to improve the behavior, it's at
the kernel device driver level.

regards, tom lane
