From: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Are we accepting cancel interrupts too often? |
Date: | 2001-12-31 16:41:56 |
Message-ID: | 200112311641.fBVGfuP28721@candle.pha.pa.us |
Lists: | pgsql-hackers |
Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > I started to look at when this nice code was added to determine if this
> > was part of the original design or added later and found you wrote it
> > yourself, so I guess we don't have to ask anyone to make sure there
> > isn't something we are missing.
>
> As far as I can recall my thinking at the time, it went like so:
> "We *should* be able to accept a cancel interrupt anywhere we are not
> actually in the midst of modifying shared-memory data structures,
> because after all the database system is supposed to be robust against
> crashes, and those could happen anyplace".
>
> But the fallacy in equating a cancel to a crash is that we have rather
> extensive logic for coping with a crash (including reinitializing shared
> memory from scratch). A cancel will only provoke elog cleanup, which is
> not nearly as thorough. For example, it's not obvious that shared
> memory structures that are protected by different locks couldn't get out
> of sync.
>
Yes, I saw the RESUME_INTERRUPTS in SpinLockRelease(). It seems very
aggressive to allow a query cancel there.
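
To make the worry concrete, here is a minimal, self-contained sketch; it is not the backend code. Every name in it (hold_interrupts, resume_interrupts, check_for_interrupts, update_both, run_demo, shared_a, shared_b) is hypothetical, the "locks" are only a holdoff counter, and the longjmp stands in for elog-style cleanup. It assumes that releasing a lock may service a pending cancel immediately, and compares that with deferring the cancel to an explicit safe point.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the backend's interrupt bookkeeping. */
static int     hold_count = 0;        /* > 0 means cancels must wait */
static bool    cancel_pending = false;
static jmp_buf abort_target;          /* where a serviced cancel jumps to */

/* Two "shared" counters, each imagined to be guarded by its own lock. */
static int shared_a = 0;
static int shared_b = 0;

/* Service a pending cancel only when no holdoff is active. */
static void check_for_interrupts(void)
{
    if (hold_count == 0 && cancel_pending)
    {
        cancel_pending = false;
        longjmp(abort_target, 1);     /* abort the operation in progress */
    }
}

static void hold_interrupts(void)
{
    hold_count++;
}

/* check_now = true models a lock release that also accepts cancels. */
static void resume_interrupts(bool check_now)
{
    assert(hold_count > 0);
    hold_count--;
    if (check_now)
        check_for_interrupts();
}

/* Update both structures, taking and releasing each "lock" in turn. */
static void update_both(bool eager)
{
    hold_interrupts();                /* "acquire lock A" */
    shared_a++;
    cancel_pending = true;            /* pretend a cancel arrives here */
    resume_interrupts(eager);         /* "release lock A" */

    hold_interrupts();                /* "acquire lock B" */
    shared_b++;
    resume_interrupts(eager);         /* "release lock B" */
}

/* Returns true if the two counters still agree after the attempt. */
bool run_demo(bool eager)
{
    shared_a = 0;
    shared_b = 0;
    hold_count = 0;
    cancel_pending = false;

    if (setjmp(abort_target) == 0)
    {
        update_both(eager);
        check_for_interrupts();       /* explicit safe point */
    }
    return shared_a == shared_b;
}
```

In this toy model, run_demo(true) services the cancel at the first release and leaves the two counters disagreeing, while run_demo(false) finishes both updates before the cancel is accepted. Whether any real pair of structures can diverge this way depends on the actual call sites.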
>
> BTW, I spent some time yesterday trying to use this worry to explain my
> latest favorite bugaboo, the duplicate-rows complaints we've gotten from
> a few people. It is easy to see that a cancel being accepted at the
> right place (exit from the first WriteBuffer in heap_update) could leave
> an updated tuple created and its buffer marked dirty, while the old
> tuple's buffer is not yet marked dirty and might therefore be discarded
> unwritten. (The WAL entry is correct but will never be consulted unless
> there's a crash.) However, this scenario doesn't seem to explain the
> failures because the cancel would lead to transaction abort, so the
> updated tuple should never be considered good anyway. Back to the
> drawing board...
I thought we were seeing duplicates in 7.1, which didn't have this code.
--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026