From: | "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com> |
---|---|
To: | Gavin Sherry <swm(at)linuxworld(dot)com(dot)au> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)", |
Date: | 2005-10-28 17:26:03 |
Message-ID: | 20051028172602.GH13187@pervasive.com |
Lists: | pgsql-hackers |
On Fri, Oct 28, 2005 at 02:26:31PM +1000, Gavin Sherry wrote:
> Have spoken with Jim on IRC, he says that there have been several crashes
> recently due to a faulty disk array. I guess the zeroing could be an
> outcome of the faulty disk. I wonder whether a crash caused by the faulty
> disk could have happened somewhere around mdextend(), after we create a
> zero'd page but before we have written out the initialised page.
Just to clarify, there's no evidence that the array is faulty. I do know
that they were using write-back with a non-battery-backed cache though.
What has been happening is periodic random crashes, about one a week. I
now have a good core for one, as well as an assert:
TRAP: FailedAssertion("!(shared->page_number[slotno] == pageno &&
shared->page_status[slotno] == SLRU_PAGE_READ_IN_PROGRESS)", File:
"slru.c", Line: 308)
I haven't looked at that code yet, so I have no idea what that actually
means. Let me know what info y'all would like to see out of the core.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461