Re: Patch: Write Amplification Reduction Method (WARM)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch: Write Amplification Reduction Method (WARM)
Date: 2017-03-28 10:37:52
Message-ID: CAA4eK1J455T2BBH16rj2ez57koVaUrSWfaMecXzNPfu6eEXg7A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 27, 2017 at 2:19 PM, Pavan Deolasee
<pavan(dot)deolasee(at)gmail(dot)com> wrote:
>
>
> On Fri, Mar 24, 2017 at 11:49 PM, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
> wrote:
>>
>>
>>
>> On Fri, Mar 24, 2017 at 6:46 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
>> wrote:
>>>
>>>
>>>
>>> I was worried for the case if the index is created non-default
>>> collation, will the datumIsEqual() suffice the need. Now again
>>> thinking about it, I think it will because in the index tuple we are
>>> storing the value as in heap tuple. However today it occurred to me
>>> how will this work for toasted index values (index value >
>>> TOAST_INDEX_TARGET). It is mentioned on top of datumIsEqual() that it
>>> probably won't work for toasted values. Have you considered that
>>> point?
>>>
>>
>> No, I haven't and thanks for bringing that up. And now that I think more
>> about it and see the code, I think the naive way of just comparing index
>> attribute value against heap values is probably wrong. The example of
>> TOAST_INDEX_TARGET is one such case, but I wonder if there are other varlena
>> attributes that we might store differently in heap and index. Like
>> index_form_tuple() ->heap_fill_tuple seem to some churning for varlena. It's
>> not clear to me if index_get_attr will return the values which are binary
>> comparable to heap values.. I wonder if calling index_form_tuple on the heap
>> values, fetching attributes via index_get_attr on both index tuples and then
>> doing a binary compare is a more robust idea. Or may be that's just
>> duplicating efforts.
>>
>> While looking at this problem, it occurred to me that the assumptions made
>> for hash indexes are also wrong :-( Hash index has the same problem as
>> expression indexes have. A change in heap value may not necessarily cause a
>> change in the hash key. If we don't detect that, we will end up having two
>> hash identical hash keys with the same TID pointer. This will cause the
>> duplicate key scans problem since hashrecheck will return true for both the
>> hash entries. That's a bummer as far as supporting WARM for hash indexes is
>> concerned, unless we find a way to avoid duplicate index entries.
>>
>
> Revised patches are attached.
>

Noted few cosmetic issues in 0005_warm_updates_v21:

1.
pruneheap.c(939): warning C4098: 'heap_get_root_tuples' : 'void'
function returning a value

2.
+ * HCWC_WARM_UPDATED_TUPLE - a tuple with HEAP_WARM_UPDATED is found somewhere
+ * in the chain. Note that when a tuple is WARM
+ * updated, both old and new versions are marked
+ * with this flag/
+ *
+ * HCWC_WARM_TUPLE - a tuple with HEAP_WARM_TUPLE is found somewhere in
+ * the chain.
+ *
+ * HCWC_CLEAR_TUPLE - a tuple without HEAP_WARM_TUPLE is found somewhere in
+ * the chain.

Description of all three flags is same.

3.
+ * HCWC_WARM_UPDATED_TUPLE - a tuple with HEAP_WARM_UPDATED is found somewhere
+ * in the chain. Note that when a tuple is WARM
+ * updated, both old and new versions are marked
+ * with this flag/

Spurious '/' at end of line.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2017-03-28 10:50:45 Re: pg_dump emits ALTER TABLE ONLY partitioned_table
Previous Message Amit Langote 2017-03-28 10:35:54 Re: Documentation improvements for partitioning