From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Patrick Earl" <patearl(at)patearl(dot)net> |
Cc: | pgsql-general(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Checkpoint request failed on version 8.2.1. |
Date: | 2007-01-11 20:14:37 |
Message-ID: | 29277.1168546477@sss.pgh.pa.us |
Lists: | pgsql-general pgsql-hackers |
"Patrick Earl" <patearl(at)patearl(dot)net> writes:
> In any case, the unit tests remove all contents and schema within the
> database before starting, and they remove the tables they create as
> they proceed. Certainly there are many things that have been recently
> deleted.
Yeah, I think then there's no question that the bgwriter is trying to
fsync something that's been deleted but isn't yet closed by every
process. We have things set up so that that's not a really serious
problem anymore --- eventually it will be closed and then the next
checkpoint will succeed. But CREATE DATABASE insists on checkpointing
and so it's vulnerable to even a transient failure.
I've been resisting changing the checkpoint code to treat EACCES as a
non-error situation on Windows, but maybe we have no choice. How do
people feel about this idea: if we are on Windows (#ifdef WIN32) and the
open or fsync fails with EACCES, then
1. Emit a LOG (or maybe DEBUG) message noting the problem.
2. Leave the fsync request entry in the hashtable for next time.
3. Allow the current checkpoint to complete normally anyway.
If the file has actually been deleted, then eventually it will be closed
and the next checkpoint will be able to remove the hash entry. If
there's something else wrong, we'll keep bleating and maybe the DBA will
notice eventually.
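For concreteness, the three steps above could look roughly like the
following sketch. This is not the actual checkpoint code: PendingFsync,
SyncOutcome, handle_sync_result and TOLERATE_EACCES are hypothetical names
made up for illustration, and the hashtable is reduced to a single entry
with a "pending" flag.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* The proposal is Windows-only; elsewhere EACCES stays a hard error. */
#ifdef WIN32
#define TOLERATE_EACCES true
#else
#define TOLERATE_EACCES false
#endif

/* Hypothetical stand-in for one entry in the pending-fsync hashtable. */
typedef struct PendingFsync
{
    const char *path;    /* file the checkpoint wants to fsync */
    bool        pending; /* true while the entry remains in the table */
} PendingFsync;

typedef enum SyncOutcome
{
    SYNC_DONE,        /* fsync succeeded; entry removed */
    SYNC_RETRY_LATER, /* EACCES tolerated; entry kept for next checkpoint */
    SYNC_FAILED       /* any other error; the checkpoint must fail */
} SyncOutcome;

/*
 * Decide what to do with one request, given the errno value reported by
 * open or fsync (0 means success).
 */
static SyncOutcome
handle_sync_result(PendingFsync *req, int err, bool tolerate_eacces)
{
    if (err == 0)
    {
        req->pending = false;   /* done; drop the request */
        return SYNC_DONE;
    }

    if (err == EACCES && tolerate_eacces)
    {
        /* Step 1: note the problem at LOG level. */
        fprintf(stderr,
                "LOG: could not fsync \"%s\": permission denied; "
                "will retry at next checkpoint\n", req->path);
        /* Step 2: leave the entry in place (pending stays true). */
        /* Step 3: non-fatal outcome, so the checkpoint can complete. */
        return SYNC_RETRY_LATER;
    }

    return SYNC_FAILED;
}
```

A caller would pass TOLERATE_EACCES as the last argument and fail the
checkpoint only on SYNC_FAILED. The flag is a runtime parameter here only
so that both behaviors can be exercised on any platform.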
The downside of this is that a real EACCES problem wouldn't get noted at
any level higher than LOG, and so you could theoretically lose data
without much warning. But I'm not seeing anything else we could do
about it --- AFAIK we have not heard of a way we can distinguish this
case from a real permissions problem. And anyway there should never
*be* a real permissions problem; if there is then the user's been poking
under the hood sufficient to void the warranty anyway ;-)
Comments?
regards, tom lane