From: | "Andrew Hammond" <andrew(dot)george(dot)hammond(at)gmail(dot)com> |
---|---|
To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Hackers <pgsql-hackers(at)postgresql(dot)org>, "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> |
Subject: | Re: the un-vacuumable table |
Date: | 2008-07-07 19:00:12 |
Message-ID: | 5a0a9d6f0807071200h4895ecd5m6eee060ab4ea2953@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Jul 3, 2008 at 10:57 PM, Andrew Hammond
<andrew(dot)george(dot)hammond(at)gmail(dot)com> wrote:
> On Thu, Jul 3, 2008 at 3:47 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Have you looked into the machine's kernel log to see if there is any
>> evidence of low-level distress (hardware or filesystem level)? I'm
>> wondering if ENOSPC is being reported because it is the closest
>> available errno code, but the real problem is something different than
>> the error message text suggests. Other than the errno the symptoms
>> all look quite a bit like a bad-sector problem ...
da1 is the storage device where PGDATA lives. The kernel log shows:
Jun 19 03:06:14 db1 kernel: mpt1: request 0xffffffff929ba560:6810
timed out for ccb 0xffffff0000e20000 (req->ccb 0xffffff0000e20000)
Jun 19 03:06:14 db1 kernel: mpt1: request 0xffffffff929b90c0:6811
timed out for ccb 0xffffff0001081000 (req->ccb 0xffffff0001081000)
Jun 19 03:06:14 db1 kernel: mpt1: request 0xffffffff929b9f88:6812
timed out for ccb 0xffffff0000d93800 (req->ccb 0xffffff0000d93800)
Jun 19 03:06:14 db1 kernel: mpt1: attempting to abort req
0xffffffff929ba560:6810 function 0
Jun 19 03:06:14 db1 kernel: mpt1: request 0xffffffff929bcc90:6813
timed out for ccb 0xffffff03e132dc00 (req->ccb 0xffffff03e132dc00)
Jun 19 03:06:14 db1 kernel: mpt1: completing timedout/aborted req
0xffffffff929ba560:6810
Jun 19 03:06:14 db1 kernel: mpt1: abort of req 0xffffffff929ba560:0 completed
Jun 19 03:06:14 db1 kernel: mpt1: attempting to abort req
0xffffffff929b90c0:6811 function 0
Jun 19 03:06:14 db1 kernel: mpt1: completing timedout/aborted req
0xffffffff929b90c0:6811
Jun 19 03:06:14 db1 kernel: mpt1: abort of req 0xffffffff929b90c0:0 completed
Jun 19 03:06:14 db1 kernel: mpt1: attempting to abort req
0xffffffff929b9f88:6812 function 0
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): WRITE(16). CDB: 8a 0 0 0
0 1 6c 99 9 c0 0 0 0 20 0 0
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): CAM Status: SCSI Status Error
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): SCSI Status: Check Condition
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): UNIT ATTENTION asc:29,0
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): Power on, reset, or bus
device reset occurred
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): Retrying Command (per Sense Data)
Jun 19 03:06:14 db1 kernel: mpt1: completing timedout/aborted req
0xffffffff929b9f88:6812
Jun 19 03:06:14 db1 kernel: mpt1: abort of req 0xffffffff929b9f88:0 completed
Jun 19 03:06:14 db1 kernel: mpt1: attempting to abort req
0xffffffff929bcc90:6813 function 0
Jun 19 03:06:14 db1 kernel: mpt1: completing timedout/aborted req
0xffffffff929bcc90:6813
Jun 19 03:06:14 db1 kernel: mpt1: abort of req 0xffffffff929bcc90:0 completed
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): WRITE(16). CDB: 8a 0 0 0
0 1 65 1b 71 a0 0 0 0 20 0 0
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): CAM Status: SCSI Status Error
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): SCSI Status: Check Condition
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): UNIT ATTENTION asc:29,0
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): Power on, reset, or bus
device reset occurred
Jun 19 03:06:14 db1 kernel: (da1:mpt1:0:0:0): Retrying Command (per Sense Data)
Jun 19 03:18:16 db1 kernel: mpt3: request 0xffffffff929d5900:56299
timed out for ccb 0xffffff03df7f5000 (req->ccb 0xffffff03df7f5000)
I think this is a smoking gun: mpt1 requests time out, then da1 reports a
bus/device reset on the retried writes.
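
For anyone who wants to map the failed writes back to a location on the device, the two WRITE(16) CDBs in the log can be decoded by hand or with a short script. The sketch below relies only on the standard SCSI WRITE(16) layout (opcode 0x8A, bytes 2-9 the big-endian logical block address, bytes 10-13 the big-endian transfer length in blocks). The function name `decode_write16` is made up for illustration, and the script says nothing about the block size, which would have to be checked on the machine itself.

```python
def decode_write16(cdb_text):
    """Decode a WRITE(16) CDB given as space-separated hex bytes.

    Returns (lba, blocks): the starting logical block address and the
    transfer length in logical blocks.
    """
    cdb = bytes(int(tok, 16) for tok in cdb_text.split())
    if len(cdb) != 16 or cdb[0] != 0x8A:
        raise ValueError("not a 16-byte WRITE(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2..9
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10..13
    return lba, blocks


# The two CDBs exactly as they appear in the kernel log above.
LOGGED_CDBS = [
    "8a 0 0 0 0 1 6c 99 9 c0 0 0 0 20 0 0",
    "8a 0 0 0 0 1 65 1b 71 a0 0 0 0 20 0 0",
]

if __name__ == "__main__":
    for text in LOGGED_CDBS:
        lba, blocks = decode_write16(text)
        print(f"LBA {lba:#x}, {blocks} blocks")
```

Both commands are 32-block writes. If da1 uses 512-byte blocks (an assumption, not something the log states), that is 16 kB per write.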
Andrew