From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Bruce Momjian <bruce(at)momjian(dot)us>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: UNDO and in-place update |
Date: | 2016-12-06 16:05:19 |
Message-ID: | CA+TgmoZ4Yh4BwGMZG33PM0KmxzToYmRPO=ynj6bfQ8RKD_=x=A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Dec 5, 2016 at 4:49 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> I can very well understand the reason for not doing so (IIUC, it is
> about complexity and time to get it right when we are already trying
> to solve one big and complex problem of the system), but saying most
> of the benefit can be realized by having simple optimization seems
> less convincing. I think having simple optimizations won't solve the
> multi-pass vacuum problem. However, having the visibility information
> in the index will solve that and avoid index bloat much aggressively.
I don't think that multi-pass vacuum would exist in this system,
regardless of any of the details we're talking about here. If you are
doing in-place updates of tuples and changing the indexed columns,
then you can't rely on the current system for removing index entries
at all. The indexed entries don't point to a "dead" line pointer;
they point to a tuple that no longer matches the index entry.
>> 3. Otherwise, you need to look at the heap page. But if the heap
>> page's UNDO pointer is invalid, then the whole page is all-visible, so
>> you're done. (You can also set the visibility map bit and/or the
>> index tuple's definitely-all-visible bit to avoid having to recheck
>> the heap page in the future.)
>>
>> 4. Each tuple will have a bit indicating whether it's been recently
>> modified; that is, whether the transaction whose UNDO log is
>> referenced by the page's UNDO pointer did anything to that tuple; or
>> if the page points to a TPD entry, whether any of the transactions in
>> the TPD modified that tuple. If that bit is not set for that
>> particular tuple, it's all-visible and you're done.
>
> I think 4th optimization is especially good and can cover many cases,
> but I am not sure how do we unset it once it is set. For example,
> once the effect of transaction that has touched that row is
> all-visible, how will we unset it. It might be better to store the
> offset of transaction that has last modified the tuple, if the last
> transaction that has touched the row is visible to snapshot, then we
> don't need to process the UNDO.
If the page has an invalid UNDO pointer and you change it to a valid
UNDO pointer, you unset all the bits at the same time, except of
course for the bit pertaining to the tuple you are modifying. If the
page's UNDO pointer is still valid when you update it, then you set
the bits for any tuples you are modifying for which it is not already
set.
> One more thing that needs some thoughts is how will the rollback work
> if keep just one bit for delete marking. I mean for heap we can
> directly apply from UNDO, but for the index, I think we need to also
> take care of undoing it in some way. For example, search the tree to
> remove the deleting marking from old value and delete marking the new
> value. Now if that is what we need to do then first we need to
> traverse the tree twice which is okay considering rollback is a seldom
> operation, the second challenge would be how to make such an operation
> atomic with respect to WAL.
We don't necessarily need UNDO to clean up the indexes, although it
might be a good idea. It would *work* to just periodically scan the
index for delete-marked items. For each one, visit the heap and see
if there's a version of that tuple present in the heap or current UNDO
that matches the index entry. If not, remove the index entry.
However, we could try to be more aggressive: when you UNDO an UPDATE
or DELETE in the heap, also go search the index and remove any
delete-marked entries that are no longer needed. The advantage of
this is that we'd be more likely to visit the index entries while they
are still in cache. The disadvantage is that it is complex and might
fail if the index is corrupted. We can't have the whole system melt
down in that case; UNDO has to always succeed.
>> The UNDO chain has to remain until all transactions that modified the
>> page are all-visible, not just until the transactions are committed.
>
> Okay, I think this will work if we avoid the second insertion of the
> same value-TID pair in the index as you mentioned above. Without
> that, I think we might not distinguish which of the two rows are
> visible for a transaction to which the latest (step-3) row is visible.
Agreed.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | David Fetter | 2016-12-06 16:19:46 | Re: tzdata 2016j |
Previous Message | Tom Lane | 2016-12-06 15:52:47 | Re: tzdata 2016j |