From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Patch: Write Amplification Reduction Method (WARM) |
Date: | 2017-03-28 13:34:50 |
Message-ID: | CAA4eK1LnnN9Jr7ZOAYNsfxmdoysds6HSaXMtoY5OXLJVwSx6mw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Mar 28, 2017 at 4:05 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Mon, Mar 27, 2017 at 2:19 PM, Pavan Deolasee
> <pavan(dot)deolasee(at)gmail(dot)com> wrote:
>> On Fri, Mar 24, 2017 at 11:49 PM, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
>> wrote:
>>>
>>>
>>>
>>> On Fri, Mar 24, 2017 at 6:46 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
>>> wrote:
>>>>
>>>>
>>>>
>>>> I was worried for the case if the index is created non-default
>>>> collation, will the datumIsEqual() suffice the need. Now again
>>>> thinking about it, I think it will because in the index tuple we are
>>>> storing the value as in heap tuple. However today it occurred to me
>>>> how will this work for toasted index values (index value >
>>>> TOAST_INDEX_TARGET). It is mentioned on top of datumIsEqual() that it
>>>> probably won't work for toasted values. Have you considered that
>>>> point?
>>>>
>>>
>>> No, I haven't and thanks for bringing that up. And now that I think more
>>> about it and see the code, I think the naive way of just comparing index
>>> attribute value against heap values is probably wrong. The example of
>>> TOAST_INDEX_TARGET is one such case, but I wonder if there are other varlena
>>> attributes that we might store differently in heap and index. Like
>>> index_form_tuple() ->heap_fill_tuple seem to some churning for varlena. It's
>>> not clear to me if index_get_attr will return the values which are binary
>>> comparable to heap values.. I wonder if calling index_form_tuple on the heap
>>> values, fetching attributes via index_get_attr on both index tuples and then
>>> doing a binary compare is a more robust idea. Or may be that's just
>>> duplicating efforts.
>>>
>>> While looking at this problem, it occurred to me that the assumptions made
>>> for hash indexes are also wrong :-( Hash index has the same problem as
>>> expression indexes have. A change in heap value may not necessarily cause a
>>> change in the hash key. If we don't detect that, we will end up having two
>>> hash identical hash keys with the same TID pointer. This will cause the
>>> duplicate key scans problem since hashrecheck will return true for both the
>>> hash entries. That's a bummer as far as supporting WARM for hash indexes is
>>> concerned, unless we find a way to avoid duplicate index entries.
>>>
>>
>> Revised patches are attached. I've added a few more regression tests which
>> demonstrates the problems with compressed and toasted attributes. I've now
>> implemented the idea of creating index tuple from heap values before doing
>> binary comparison using datumIsEqual. This seems to work ok and I see no
>> reason this should not be robust.
>>
>
> As asked previously, can you explain me on what basis are you
> considering it robust? The comments on top of datumIsEqual() clearly
> indicates the danger of using it for toasted values (Also, it will
> probably not give the answer you want if either datum has been
> "toasted".). If you think that because we are using it during
> heap_update to find modified columns, then I think that is not right
> comparison, because there we are comparing compressed value (of old
> tuple) with uncompressed value (of new tuple) which should always give
> result as false.
>
Yet another point to think about the recheck implementation is will it
work correctly when heap tuple itself is toasted. Consider a case
where table has integer and text column (t1 (c1 int, c2 text)) and we
have indexes on c1 and c2 columns. Now, insert a tuple such that the
text column has value more than 2 or 3K which will make it stored in
compressed form in heap (and the size of compressed value is still
more than TOAST_INDEX_TARGET). For such an heap insert, we will pass
the actual value of column to index_form_tuple during index insert.
However during recheck when we fetch the value of c2 from heap tuple
and pass it index tuple, the value is already in compressed form and
index_form_tuple might again try to compress it because the size will
still be greater than TOAST_INDEX_TARGET and if it does so, it might
make recheck fail.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | David Steele | 2017-03-28 14:03:47 | Re: Index usage for elem-contained-by-const-range clauses |
Previous Message | Daniel Verite | 2017-03-28 13:28:52 | Re: PATCH: Batch/pipelining support for libpq |