From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Jan Wieck <JanWieck(at)Yahoo(dot)com> |
Cc: | PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: 7.4.5 losing committed transactions |
Date: | 2004-09-24 22:37:05 |
Message-ID: | 5883.1096065425@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
> Is it somehow possible that the commit record was still sitting in the
> shared WAL buffers (unwritten) when the response got sent to the client?

I don't think so. What I see in the two cases I have now is:

(1) The backend that was doing the "lost" transaction is *not* the one
I kill -9'd. I know this in both cases because I know which table has
the missing entries, and I can see that that instance of the script got
a "WARNING: terminating connection because of crash of another server
process" message rather than just a connection closure.

(2) There's a pretty fair distance in the WAL between the entries
made by the "lost" transaction and the checkpoint made by recovery ---
a dozen or so other transactions were made and committed in between.
It seems unlikely that this transaction would have been the only one to
lose a WAL record if something like that had happened.

What I'm currently speculating about is that there might be some
weirdness associated with the very act of sending out the WARNING.
quickdie() isn't doing anything to ensure that the system is in a good
state before it calls ereport --- which is probably not so cool
considering it is a signal handler. It might be wise to reset at least
the elog.c state before doing this.
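
The hazard speculated about above can be sketched in a few lines of C. This is not PostgreSQL code: toy_log, toy_log_reset, toy_quickdie_report and toy_quickdie are hypothetical stand-ins for a logger that keeps shared static state and for a handler that reports before exiting. The sketch assumes the danger is a half-built message left behind by whatever code the signal interrupted; the real elog.c state is more involved.

```c
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define TOY_WARNING \
    "WARNING: terminating connection because of crash of another server process\n"

/* Hypothetical logger state; NOT the real layout of elog.c. */
static volatile sig_atomic_t log_depth = 0; /* > 0 while a message is in progress */
static char log_buf[256];                   /* shared buffer the message is built in */
static size_t log_len = 0;                  /* bytes currently held in log_buf */

/*
 * Toy logger: append msg to the shared buffer, write the buffer to stderr,
 * and return the number of bytes it tried to emit.
 */
static size_t toy_log(const char *msg)
{
    size_t n = strlen(msg);
    size_t room = sizeof(log_buf) - log_len;
    size_t emitted;
    ssize_t rc;

    log_depth++;
    if (n > room)
        n = room;               /* truncate rather than overflow */
    memcpy(log_buf + log_len, msg, n);
    log_len += n;
    emitted = log_len;
    rc = write(STDERR_FILENO, log_buf, log_len);
    (void) rc;                  /* nothing useful to do on failure here */
    log_len = 0;
    log_depth--;
    return emitted;
}

/* Throw away whatever the interrupted code had half-built. */
static void toy_log_reset(void)
{
    log_len = 0;
    log_depth = 0;
}

/* Reset the logger to a known state first, then report. */
static size_t toy_quickdie_report(void)
{
    toy_log_reset();
    return toy_log(TOY_WARNING);
}

/* Signal handler sketch: report from a clean state, then exit immediately. */
void toy_quickdie(int signo)
{
    (void) signo;
    toy_quickdie_report();
    _exit(2);
}
```

Without the reset, a message that was interrupted mid-build leaks its stale prefix into the warning; with it, the handler starts from a known state. Installing toy_quickdie for a particular signal (for example with sigaction()) is left out to keep the sketch short.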

Can you still reproduce the problem if you take out the ereport call
in quickdie()?

BTW, what led you to develop this test setup ... had you already seen
something that made you suspect a data loss problem?

regards, tom lane