From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Cache Hash Index meta page.
Date: 2016-08-05 13:33:48
Message-ID: CAA4eK1+_XJpq4aTn_-qVEegUMuFfeK2=PxBr5fshiAHaiA33kg@mail.gmail.com
Lists: pgsql-hackers
On Thu, Aug 4, 2016 at 3:36 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
>> On Fri, Jul 22, 2016 at 3:02 AM, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com> wrote:
>>> I have created a patch to cache the meta page of Hash index in
>>> backend-private memory. This is to save reading the meta page buffer every
>>> time when we want to find the bucket page. In “_hash_first” call, we try to
>>> read meta page buffer twice just to make sure bucket is not split after we
>>> found bucket page. With this patch meta page buffer read is not done, if the
>>> bucket is not split after caching the meta page.
>
> Is this really safe? The metapage caching in btree is all right because
> the algorithm is guaranteed to work even if it starts with a stale idea of
> where the root page is. I do not think the hash code is equally robust
> about stale data in its metapage.
>
I think stale data in the metapage could only cause a problem if it led to
a wrong calculation of the bucket from the hashkey, and I don't think
that can happen. The safety seems to come from the fact that the fields
required to calculate the bucket (lowmask/highmask) won't be changed
more than once without splitting the current bucket (the one we are
going to scan). Do you see a problem in the hashkey-to-bucket mapping
(_hash_hashkey2bucket) if the lowmask/highmask have changed by one
additional table half, or do you have something else in mind?
>
>> What happens on a system which has gone through pg_upgrade?
>
> That being one reason why. It might be okay if we add another hasho_flag
> bit saying that hasho_prevblkno really contains a maxbucket number, and
> then add tests for that bit everyplace that hasho_prevblkno is referenced.
>
Good idea.
- if (retry)
+ if (opaque->hasho_prevblkno <= metap->hashm_maxbucket)
This code seems to be problematic with respect to upgrades, because
without the patch hasho_prevblkno is initialized to 0xFFFFFFFF
(InvalidBlockNumber) rather than a maxbucket number.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com