From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Bruce Momjian <bruce(at)momjian(dot)us>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: UNDO and in-place update |
Date: | 2016-11-27 23:20:03 |
Message-ID: | CA+TgmoZco3X4BswKgZmAShJcGbnwcr06=LP7w8+WWFGtGz_zPA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Nov 26, 2016 at 10:49 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Fri, Nov 25, 2016 at 11:23 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> On Thu, Nov 24, 2016 at 12:23:28PM -0500, Robert Haas wrote:
>>> I agree up to a point. I think we need to design our own system as
>>> well as we can, not just copy what others have done. For example, the
>>> design I sketched will work with all of PostgreSQL's existing index
>>> types. You need to modify each AM in order to support in-place
>>> updates when a column indexed by that AM has been modified, and that's
>>> probably highly desirable, but it's not a hard requirement.
>>
>> I feel you are going to get into the problem of finding the index entry
>> for the old row --- the same thing that is holding back more aggressive
>> WARM updates.
>>
>
> I think I see a problem in that regards when the index column in
> updated twice with the same value. For example,
>
> table - test_undo(c1 int, c2 char(10));
> index - idx_tu on test_undo(c1);
>
> Step-1
> Insert (2, 'abc')
> Heap - 2, abc
> Index - 2
> Commit;
>
> Step -2
> Update (3,def) where c1 = 2;
> Heap - 3, def
> Index - 3 (another value will be 2)
> Undo - 2, abc
> Commit;
>
> At this point a new query which scans the index for value 2 will reach
> to the tuple (3,def). As scan started after all commits, tuple
> (3,def) appears okay, but it can compare the key value in the tuple to
> detect that this is not the right tuple. So, all is well till here.
>
> Step -3
> Update (2) where c1 = 3;
> Heap - 2, def
> Index - 2 (another values will be 3,2)
> Undo - 3;
> Commit
>
> At this point, index scan for value 2 will find index tuple of step-1
> (2) and will conclude 2,def as a right tuple, but actually, that is
> wrong as the step-1 (2) index tuple should not be visible to the user.
> Do you also this as a problem or am I missing something? If this a
> problem, then I think we need some form of delete marking system for
> the index, probably transaction visibility information as well.
Well, my original email did contain a discussion of the need for
delete-marking. I said that you couldn't do in-place updates when
indexed columns were modified unless the index AMs had support for
delete-marking, which is the same point you are making here. However,
I agree that the case where the indexed column gets set back to a
previous value while the old index tuple for that value still exists
is particularly tricky. I think that what we need to do there is
avoid inserting an index entry for a given value/TID pair if an index
entry for that value/TID pair already exists.
That's a little different from your analysis. In your step-3
analysis, you say that an index scan for 2 will find the step-1 tuple,
but that's not really right. The index scan will find the index tuple
which points to the whole chain of tuples, step-3 => step-2 => step-1,
and it will decide which heap tuple from that chain the user can see
based on the user's snapshot. That's OK, but we're in real trouble if
step-3 inserted an additional index tuple pointing to that chain
rather than simply noting that one already exists. If it adds an
additional one, then we'll visit two index tuples for that TID. Each
time, the visibility information in UNDO will cause us to return the
correct tuple, but we've erred by returning it twice (once per index
entry) rather than just once.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Jim Nasby | 2016-11-27 23:35:52 | Re: Wrong order of tests in findDependentObjects() |
Previous Message | Petr Jelinek | 2016-11-27 22:54:19 | Re: Logical Replication WIP |