From: | Marco Colombo <marco(at)esi(dot)it> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | lec <limec(at)streamyx(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Losing records when server hang |
Date: | 2004-08-09 17:18:05 |
Message-ID: | 4117B1CD.2020500@esi.it |
Lists: | pgsql-general |
Tom Lane wrote:
> lec <limec(at)streamyx(dot)com> writes:
>
>>It's a SCSI, RAID-5 on a Dell server.
>
>
>>The hardware actually "hang". The Dell engineers came and replaced the
>>motherboard but couldn't tell what the actual fault was.
>
>
>>Commit as in 'COMMIT'. 'Records' 1,2,3,4,5,6,7,8,9,10 are actually
>>transactions. I'm as puzzled as to why I lost the transactions in the
>>middle but got the last transaction.
>
>
> I'm puzzled too. I don't suppose you have the postmaster log from when
> it was trying to recover from the crash? Or even better, copies of the
> WAL files?
>
> A possible theory has to do with corruption of the WAL log. For
> instance, transactions 1-10 are all down to disk in WAL (or at least the
> kernel told postgres the writes were done) and for one reason or another
> the buffer manager chances to flush the page containing record 10 out
> to its data file before the other records' pages. Now the system hangs.
> After reboot, if the WAL log is unreadable beyond transaction 1 then the
> database would come up with transaction 1 replayed, 2-10 not replayed,
> but 10's data is out there anyway.
>
> However this would seem to imply disk drive misfeasance above and beyond
> your motherboard problem.
Well, no. How about this theory:
1) everything is ok:
the backend executes write()/fsync() for transactions 1-5
2) hardware fails somehow at the motherboard level (imagine CPU/RAM overheating):
RAM gets corrupted - kernel starts oopsing (but goes on)
meanwhile, the backend executes write()/fsync() for transactions 6-10,
but randomly corrupted data gets written to disk.
3) unrecoverable kernel error occurs, the show stops.
On recovery, transactions 6-9 don't even look like valid log entries, while
10, for some reason, does (maybe only its data is corrupted).
I'm not familiar with the details of WAL files and post-crash recovery,
but is that possible? Or does the process stop at the first failure?
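
As a rough illustration of the "stop at the first failure" case asked about above, here is a toy log replayer. It is only a sketch under stated assumptions: the record layout (transaction id, length, CRC32, payload) and the names `make_record` and `replay` are made up for this example and are not PostgreSQL's actual WAL format or recovery code.

```python
import struct
import zlib


def make_record(txid, payload):
    """Frame one record as: txid (4 bytes), length (4 bytes), CRC32 (4 bytes), payload.

    Hypothetical layout, chosen only to keep the example small.
    """
    header = struct.pack("<II", txid, len(payload))
    crc = zlib.crc32(header + payload)
    return header + struct.pack("<I", crc) + payload


def replay(log):
    """Return the ids of the transactions replayed.

    Assumption: replay stops at the first record that fails validation,
    so nothing after that point is applied, even if it is intact.
    """
    replayed = []
    pos = 0
    while pos + 12 <= len(log):
        txid, length = struct.unpack_from("<II", log, pos)
        (crc,) = struct.unpack_from("<I", log, pos + 8)
        payload = log[pos + 12 : pos + 12 + length]
        # A short payload or a checksum mismatch means the record is not valid.
        if len(payload) != length or zlib.crc32(log[pos : pos + 8] + payload) != crc:
            break
        replayed.append(txid)
        pos += 12 + length
    return replayed


# Transactions 1-10 are logged; 6-9 are damaged on their way to disk.
records = [make_record(i, b"row-%d" % i) for i in range(1, 11)]
for i in range(5, 9):  # list indexes 5..8 hold transactions 6..9
    damaged = bytearray(records[i])
    damaged[-1] ^= 0xFF  # flip every bit in the last payload byte
    records[i] = bytes(damaged)

log = b"".join(records)
print(replay(log))
```

In this model the intact record for transaction 10 is never reached, because replay gives up at transaction 6. Any trace of transaction 10 in the data files would have to come from a data page flushed before the hang, as in the theory quoted above.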
Anyway, if your CPU/RAM is failing, no DB technology can save you. You
need redundant CPU/RAM units to perform the same operations concurrently,
and the hardware to validate the results on a 2vs1 basis at least.
Ask NASA, I think they know what "mission critical" actually means. :)
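
The 2-vs-1 validation can be sketched in software, although in practice it is done by dedicated hardware. This is a minimal simulation under assumptions: `vote`, `checksum` and `faulty_checksum` are hypothetical names, and the "faulty unit" is just a function that flips one bit of its result.

```python
from collections import Counter


def vote(results):
    """Return the value that at least two of the three units agree on.

    Raises RuntimeError when all three disagree, since no result can be trusted.
    """
    value, count = Counter(results).most_common(1)[0]
    if count < 2:
        raise RuntimeError("no two units agree; result cannot be trusted")
    return value


def checksum(data):
    """Stand-in for any deterministic computation done by a healthy unit."""
    return sum(data) % 65536


def faulty_checksum(data):
    """Same computation on a unit with a flipped bit in its result (simulated)."""
    return checksum(data) ^ 0x0400


# Three redundant units run the same operation; one of them is failing.
units = [checksum, faulty_checksum, checksum]
agreed = vote([unit(b"transaction 6") for unit in units])
print(agreed)
```

With only one unit there is nothing to compare against, which is the situation described in this message: a single CPU/RAM that silently corrupts data gives no signal that anything is wrong.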
Really, when the hardware starts flipping random bits in RAM, you can't
even know how long it's been going on; it can go on for hours without the
kernel panicking or hanging at all. No one knows how good your data is.
There's no point in recovering a transaction if the data inside is corrupted.
.TM.
--
____/ ____/ /
/ / / Marco Colombo
___/ ___ / / Technical Manager
/ / / ESI s.r.l.
_____/ _____/ _/ Colombo(at)ESI(dot)it