From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, amul sul <sulamul(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Subject: | Re: wal_consistency_checking reports an inconsistency on master branch |
Date: | 2018-05-01 20:33:26 |
Message-ID: | 20180501203326.i5iibsp2cjfj5jkf@alap3.anarazel.de |
Lists: | pgsql-hackers |
On 2018-04-30 22:08:46 -0700, Andres Freund wrote:
> On 2018-04-23 07:58:30 -0700, Andres Freund wrote:
> > On 2018-04-23 13:22:21 +0300, Heikki Linnakangas wrote:
> > > On 13/04/18 13:08, Michael Paquier wrote:
> > > > On Fri, Apr 13, 2018 at 02:15:35PM +0530, amul sul wrote:
> > > > > > I have looked into this and found that the issue is in heap_xlog_delete -- we
> > > > > > failed to set the correct offset number from the target_tid when the
> > > > > > XLH_DELETE_IS_PARTITION_MOVE flag is set.
> > > >
> > > > > Oh, this looks good to me. So when a row was moved across partitions
> > > > > this could have caused incorrect tuple references on a standby, which
> > > > > could have caused corruption.
> > >
> > > Hmm. So, the problem was that HeapTupleHeaderSetMovedPartitions() only sets
> > > the block number to InvalidBlockNumber, and leaves the offset number
> > > unchanged. WAL replay didn't preserve the offset number, so the master and
> > > the standby had a different offset number in the ctid.
> >
> > Right.
> >
> > > Why does HeapTupleHeaderSetMovedPartitions() leave the offset number
> > > unchanged? The old offset number is meaningless without the block number.
> > > Also, bits and magic values in the tuple header are scarce. We're
> > > squandering a whole range of values in the ctid, everything with
> > > ip_blkid==InvalidBlockNumber, to mean "moved to different partition", when a
> > > single value would suffice.
> >
> > Yes, I agree on that.
> >
> >
> > > I kept using InvalidBlockNumber there, so ItemPointerIsValid() still
> > > considers those item pointers as invalid. But my gut feeling is actually
> > > that it would be better to use e.g. 0 as the block number, so that these
> > > item pointers would appear valid. Again, to follow the precedent of
> > > speculative insertion tokens. But I'm not sure if there was some
> > > well-thought-out reason to make them appear invalid. A comment on that would
> > > be nice, at least.
> >
> > That seems risky to me. We want something that stops EPQ style chasing
> > without running into asserts for invalid offsets...
>
> Heikki, would you rather apply this yourself or have me do it?
I pushed that now.
Greetings,
Andres Freund
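
For readers following the thread, the offset mismatch Heikki describes above can be modelled with a small standalone sketch. This is not the PostgreSQL code: SketchTid, the sketch_* functions and the value chosen for SKETCH_MOVED_OFFSET are hypothetical names made up for illustration, and the "standby" here is simply assumed to rebuild the ctid without knowing the original offset. It only shows why a marker that overwrites the block number alone can leave primary and standby with different ctids, while a single reserved (block, offset) pair cannot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a tuple's ctid; NOT PostgreSQL's ItemPointerData. */
typedef struct {
    uint32_t block;
    uint16_t offset;
} SketchTid;

#define SKETCH_INVALID_BLOCK 0xFFFFFFFFu /* models InvalidBlockNumber */
#define SKETCH_MOVED_OFFSET  0xFFFDu     /* hypothetical reserved offset value */

/* Range-style marker: only the block number is overwritten, so whatever
 * offset was already stored in the ctid survives. */
static void sketch_mark_moved_range(SketchTid *tid)
{
    tid->block = SKETCH_INVALID_BLOCK;
}

/* Single-value marker: block and offset are both set to fixed values, so
 * the result does not depend on the previous contents of the ctid. */
static void sketch_mark_moved_single(SketchTid *tid)
{
    tid->block = SKETCH_INVALID_BLOCK;
    tid->offset = SKETCH_MOVED_OFFSET;
}

static bool sketch_tid_equal(SketchTid a, SketchTid b)
{
    return a.block == b.block && a.offset == b.offset;
}

/* Compares both marking schemes; returns 0 when the checks pass. */
static int sketch_demo(void)
{
    SketchTid primary = {7, 3}; /* ctid as it was on the primary */
    SketchTid standby = {0, 0}; /* assumed: replay lost the original offset */

    /* Range-style marker: the stale offsets differ, so the pages differ. */
    sketch_mark_moved_range(&primary);
    sketch_mark_moved_range(&standby);
    assert(!sketch_tid_equal(primary, standby));

    /* Single-value marker: both sides end up with the same ctid. */
    primary = (SketchTid){7, 3};
    standby = (SketchTid){0, 0};
    sketch_mark_moved_single(&primary);
    sketch_mark_moved_single(&standby);
    assert(sketch_tid_equal(primary, standby));

    return 0;
}
```

Calling sketch_demo() from a main() exercises both cases. The second scheme also uses up only one ctid value instead of every offset under the invalid block number, which is the scarcity point raised in the quoted discussion.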