From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: GIN data corruption bug(s) in 9.6devel |
Date: | 2016-04-22 06:00:46 |
Message-ID: | 20160422060046.GC2042217@tornado.leadboat.com |
Lists: | pgsql-hackers |
On Mon, Apr 18, 2016 at 05:48:17PM +0300, Teodor Sigaev wrote:
> >>Added, see attached patch (based on v3.1)
> >
> >With this applied, I am getting a couple errors I have not seen before
> >after extensive crash recovery testing:
> >ERROR: attempted to delete invisible tuple
> >ERROR: unexpected chunk number 1 (expected 2) for toast value
> >100338365 in pg_toast_16425
> Huh, it seems this is not related to GIN at all... Indexes don't touch the
> toast machinery. The only place where this error can occur is heap_delete(),
> when deleting an already-deleted tuple.
Like you, I would not expect gin_alone_cleanup-4.patch to cause such an error.
I get the impression Jeff has a test case that he had run in many iterations
against the unpatched baseline. I also get the impression that a similar or
smaller number of its iterations against gin_alone_cleanup-4.patch triggered
these two errors (once apiece, or multiple times?). Jeff, is that right? If
so, until we determine the cause, we should assume the cause arrived in
gin_alone_cleanup-4.patch. An error in pointer arithmetic or locking might
corrupt an unrelated buffer, leading to this symptom.
> >I've restarted the test harness with intentional crashes turned off,
> >to see if the problems are related to crash recovery or are more
> >generic than that.
> >
> >I've never seen these particular problems before, so don't have much
> >insight into what might be going on or how to debug it.
Could you describe the test case in sufficient detail for Teodor to reproduce
your results?
> Check my reasoning: in version 4 I added remembering of the pending list's
> tail in the blknoFinish variable. When we read the page that was the tail at
> cleanup start, we set the cleanupFinish variable, and after cleaning that
> page we stop further cleanup. Any insert made during cleanup will be
> placed after blknoFinish (corner case: in that page), so vacuum should not
> miss tuples marked as deleted.
Would any hacker volunteer to review Teodor's reasoning here?
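For anyone picking this up, the stopping rule described above can be sketched as a toy model. This is not the code in gin_alone_cleanup-4.patch: the PendingList struct, pending_append(), cleanup_pending() and the inserts_per_step parameter are hypothetical names invented for illustration, and only blknoFinish and cleanupFinish are taken from Teodor's description. It models whole-page appends only, so the corner case of an insert landing in the tail page itself is not covered.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PAGES 64
#define INVALID_BLKNO (-1)

/* Toy stand-in for the pending list: pages chained by array index. */
typedef struct
{
    int  next[MAX_PAGES];    /* next page in the list, or INVALID_BLKNO */
    bool cleaned[MAX_PAGES]; /* true once cleanup has processed the page */
    int  head;
    int  tail;
    int  npages;
} PendingList;

static void
pending_init(PendingList *pl)
{
    pl->head = INVALID_BLKNO;
    pl->tail = INVALID_BLKNO;
    pl->npages = 0;
}

/* Append a new page at the tail, as a concurrent insert would. */
static int
pending_append(PendingList *pl)
{
    int blkno;

    assert(pl->npages < MAX_PAGES);
    blkno = pl->npages++;
    pl->next[blkno] = INVALID_BLKNO;
    pl->cleaned[blkno] = false;
    if (pl->tail == INVALID_BLKNO)
        pl->head = blkno;
    else
        pl->next[pl->tail] = blkno;
    pl->tail = blkno;
    return blkno;
}

/*
 * Clean pages from the head up to and including the page that was the
 * tail when cleanup started.  After each cleaned page, inserts_per_step
 * new pages are appended to simulate concurrent inserts (hypothetical
 * knob, for the model only).  Returns the number of pages cleaned.
 */
static int
cleanup_pending(PendingList *pl, int inserts_per_step)
{
    int  blknoFinish = pl->tail;   /* tail remembered at cleanup start */
    bool cleanupFinish = false;
    int  blkno = pl->head;
    int  ncleaned = 0;

    while (blkno != INVALID_BLKNO && !cleanupFinish)
    {
        if (blkno == blknoFinish)
            cleanupFinish = true;  /* this is the last page we touch */

        pl->cleaned[blkno] = true;
        ncleaned++;

        for (int i = 0; i < inserts_per_step; i++)
            pending_append(pl);

        blkno = pl->next[blkno];
    }

    /* Whatever was appended during cleanup stays in the list. */
    pl->head = blkno;
    if (blkno == INVALID_BLKNO)
        pl->tail = INVALID_BLKNO;
    return ncleaned;
}
```

In this model, every page present at cleanup start is processed exactly once, and pages appended meanwhile are left for the next cleanup. Whether the real patch preserves the same property under its actual locking is the part that needs review.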
Thanks,
nm