Re: Patch: Write Amplification Reduction Method (WARM)

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch: Write Amplification Reduction Method (WARM)
Date: 2017-03-28 03:11:51
Message-ID: CABOikdP5RiFKSiV2ooeN+ikrykh5Y4_s9cs9djt3Yrx62wcsnw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 27, 2017 at 4:45 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:

> On Sat, Mar 25, 2017 at 1:24 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
> > On Fri, Mar 24, 2017 at 11:49 PM, Pavan Deolasee
> > <pavan(dot)deolasee(at)gmail(dot)com> wrote:
> >>
> >> On Fri, Mar 24, 2017 at 6:46 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> >> wrote:
> >>>
> >
> >> While looking at this problem, it occurred to me that the assumptions made
> >> for hash indexes are also wrong :-( Hash index has the same problem as
> >> expression indexes have. A change in heap value may not necessarily cause a
> >> change in the hash key. If we don't detect that, we will end up having two
> >> identical hash keys with the same TID pointer. This will cause the
> >> duplicate key scans problem since hashrecheck will return true for both the
> >> hash entries.
>
> Isn't it possible to detect duplicate keys in hashrecheck if we
> compare both hashkey and tid stored in index tuple with the
> corresponding values from heap tuple?
>
>
Hmm, I thought that wouldn't work. For example, say we have a tuple (X, Y, Z)
in the heap with a btree index on X and a hash index on Y. Suppose it is
updated to (X, Y', Z), we do a WARM update, and we insert a new entry in
the hash index. If Y and Y' both generate the same hashkey, we will have
two identical <hashkey, TID> tuples in the hash index, leading to duplicate
key scans.
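
To make the scenario concrete, here is a toy model in Python. It is not
PostgreSQL code: toy_hash, ROOT_TID, and scan are hypothetical names, the hash
function is deliberately weak so that two values collide, and the recheck is
reduced to a bare hash comparison. It only shows why comparing <hashkey, TID>
cannot distinguish the two entries.

```python
# Toy model only: not PostgreSQL code. All names here are hypothetical.

def toy_hash(value):
    # Deliberately weak hash so that two distinct values collide.
    return value % 8

ROOT_TID = (0, 1)  # (block, offset) of the root of the WARM chain

# Heap tuple (X, Y, Z) is updated to (X, Y', Z); Y and Y' collide under toy_hash.
old_y, new_y = 3, 11

# The hash index stores <hashkey, TID> pairs. The WARM update adds a second
# entry that points at the same root TID as the original entry.
hash_index = [
    (toy_hash(old_y), ROOT_TID),  # original entry
    (toy_hash(new_y), ROOT_TID),  # entry added by the WARM update
]

def scan(index, key, visible_y):
    """Return the TIDs that a scan for `key` would emit.

    The recheck compares the hash key in the index entry with the hash of
    the visible heap value, so it passes for both entries.
    """
    results = []
    for hashkey, tid in index:
        if hashkey != toy_hash(key):
            continue
        if hashkey == toy_hash(visible_y):  # simplified recheck
            results.append(tid)
    return results

matches = scan(hash_index, new_y, visible_y=new_y)
```

In this model both index entries are equal, so matches contains the same root
TID twice, which is the duplicate key scan described above.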

I think one way to solve this is to pass both old and new heap values to
amwarminsert and expect each AM to detect duplicates and avoid creating a
WARM pointer if the index keys are exactly the same (we can do that since
there already exists another index tuple with the same keys pointing to the
same root TID).
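
A rough sketch of that idea, again as a toy Python model and not the actual
amwarminsert interface: toy_warm_insert and toy_hash are hypothetical names,
and the index is just a list of <hashkey, TID> pairs. The only point is that
the AM computes the index key from both the old and the new heap value and
skips the insert when they match.

```python
# Toy model only: not PostgreSQL code. All names here are hypothetical.

def toy_hash(value):
    # Deliberately weak hash so that two distinct values collide.
    return value % 8

def toy_warm_insert(index, old_value, new_value, root_tid):
    """Add a WARM entry unless the index key is unchanged.

    Returns True if a new entry was inserted, False if it was skipped
    because an entry with the same key and root TID must already exist.
    """
    old_key = toy_hash(old_value)
    new_key = toy_hash(new_value)
    if old_key == new_key:
        return False
    index.append((new_key, root_tid))
    return True

root_tid = (0, 1)
index = [(toy_hash(3), root_tid)]  # entry for the original heap value

collided = toy_warm_insert(index, 3, 11, root_tid)  # same hash key: skipped
changed = toy_warm_insert(index, 3, 4, root_tid)    # new hash key: inserted
```

With this check, the colliding update leaves the index untouched, while an
update that really changes the hash key still gets its own entry.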

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
