From: | The Hermit Hacker <scrappy(at)hub(dot)org> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Alfred Perlstein <bright(at)wintelcom(dot)net>, "Mikheev, Vadim" <vmikheev(at)SECTORBASE(dot)COM>, pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: The lightbulb just went on... |
Date: | 2000-10-17 00:54:00 |
Message-ID: | Pine.BSF.4.21.0010162151320.342-100000@thelab.hub.org |
Lists: | pgsql-hackers |
Something to force a v7.0.3 ... ?
On Mon, 16 Oct 2000, Tom Lane wrote:
> ... with a blinding flash ...
>
> The VACUUM funnies I was complaining about before may or may not be real
> bugs, but they are not what's biting Alfred. None of them can lead to
> the observed crashes AFAICT.
>
> What's biting Alfred is the code that moves a tuple update chain, lines
> 1541 ff in REL7_0_PATCHES. This sets up a pointer to a source tuple in
> "tuple". Then it gets the destination page it plans to move the tuple
> to, and applies vc_vacpage to that page if it hasn't been done already.
> But when we're moving a tuple chain, *it is possible for the destination
> page to be the same as the source page*. Since vc_vacpage applies
> PageRepairFragmentation, all the live tuples on the page may get moved.
> Afterwards, tuple.t_data is out of date and pointing at some random
> chunk of some other tuple. The subsequent copy of the tuple copies
> garbage, which explains Alfred's several crashes in constructing index
> entries for the copied tuple (all of which bombed out from the
> index-build calls at lines 1634 ff, i.e., for tuples being moved as part
> of a chain). Once in a while, the obsolete pointer will be pointing at
> the real header of a different tuple --- perhaps even the place where we
> are about to put the copy. This improbable case explains the one
> observed Assert crash in which a copied tuple's HEAP_MOVED_IN bit
> mysteriously got turned off. Reason: it was cleared through the
> old-tuple pointer just after being set via the new-tuple one.
>
> Proof that this is happening can be seen in the core dumps for Alfred's
> index-construction-crash cases: tuple.t_data does not point at the same
> place that the tuple.ip_posid'th page line item points at. This could
> only happen if the page was reshuffled since the tuple pointer was set
> up. The explanation for the Assert crash is a bit of a leap of faith,
> but I feel confident that it's right.
>
> The solution is to do everything we're going to do with the source
> tuple, especially copying it and updating its state, *before* we apply
> vc_vacpage to the destination page. Then we don't care if the source
> gets moved during vc_vacpage.
>
> I will prepare a patch along this line and send it to Alfred for
> testing.
>
> regards, tom lane
>
>
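For anyone following along, here is a rough toy model of the failure Tom
describes. It is not PostgreSQL code: ToyPage, compact, move_buggy and
move_fixed are hypothetical names, a byte offset stands in for tuple.t_data,
and compact() is only a stand-in for what PageRepairFragmentation does (it
packs toward the front of the buffer purely for simplicity). It just shows
why a position remembered before the page is compacted can end up pointing
at another tuple, and why copying first avoids that.

```python
# Toy model only: NOT PostgreSQL code. All names here are hypothetical.

class ToyPage:
    """A byte buffer plus line items mapping slot number -> (offset, length)."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.items = {}  # live tuples only

    def put(self, slot, offset, payload):
        self.data[offset:offset + len(payload)] = payload
        self.items[slot] = (offset, len(payload))

    def read_at(self, offset, length):
        return bytes(self.data[offset:offset + length])

    def read_slot(self, slot):
        offset, length = self.items[slot]
        return self.read_at(offset, length)

    def compact(self):
        """Pack live tuples together and update their line items."""
        packed = bytearray(len(self.data))
        pos = 0
        for slot in sorted(self.items, key=lambda s: self.items[s][0]):
            offset, length = self.items[slot]
            packed[pos:pos + length] = self.data[offset:offset + length]
            self.items[slot] = (pos, length)
            pos += length
        self.data = packed


def make_page():
    page = ToyPage(16)
    page.put(1, 4, b"BBBB")  # bytes 0..3 are left as dead space
    page.put(2, 8, b"CCCC")
    return page


def move_buggy(page, slot):
    """Remember the tuple's position, compact the page, then copy."""
    offset, length = page.items[slot]     # analogous to setting up tuple.t_data
    page.compact()                        # destination page == source page
    return page.read_at(offset, length)   # stale position: copies what is there now


def move_fixed(page, slot):
    """Copy the source tuple first, then compact the page."""
    copy = page.read_slot(slot)
    page.compact()
    return copy


if __name__ == "__main__":
    print(move_buggy(make_page(), 1))
    print(move_fixed(make_page(), 1))
```

In this model the buggy ordering hands back the bytes of slot 2 when asked
to move slot 1, while the fixed ordering returns the original tuple. After
compact() the remembered offset also no longer matches the slot's line item,
which is the same kind of mismatch Tom points to in the core dumps.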
Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy(at)hub(dot)org secondary: scrappy(at){freebsd|postgresql}.org