From: | Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: Cache Hash Index meta page. |
Date: | 2016-12-16 10:16:17 |
Message-ID: | CAD__OuhWbWt=t-nXGMGdvdqqm7X8aNfoh=AA13huVtx80S8+9g@mail.gmail.com |
Lists: | pgsql-hackers |
Thanks Robert, I have tried to address all of the comments,
On Tue, Dec 6, 2016 at 2:20 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> + bucket = _hash_hashkey2bucket(hashkey, metap->hashm_maxbucket,
> metap->hashm_highmask,
> metap->hashm_lowmask);
>
> This hunk appears useless.
>
> + metap = (HashMetaPage)rel->rd_amcache;
>
> Whitespace.
>
Fixed.
>
> + /* Cache the metapage data for next time*/
>
> Whitespace.
>
Fixed.
> + /* Check if this bucket is split after we have cached the
> metapage.
>
> Whitespace.
>
Fixed.
>
> Shouldn't _hash_doinsert() be using the cache, too?
>
Yes, we have an opportunity there, and I have added the same in the code. One
difference, though, is that for inserts we still need to read the metapage at
least once at the end in order to write and/or split, so the performance
improvement may not be as large as in the read-only case.
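To make the flow concrete, here is a minimal sketch (not the patch itself;
MetaCache, refresh_meta_cache() and cached_bucket_to_blkno() are hypothetical
names) of how the insert path can use the cached metapage fields and fall back
to re-reading the metapage only when the cached bucket mapping turns out to be
stale:

#include "postgres.h"
#include "access/hash.h"
#include "storage/bufpage.h"

/* Hypothetical cache kept in rel->rd_amcache. */
typedef struct MetaCache
{
    uint32      maxbucket;
    uint32      highmask;
    uint32      lowmask;
    /* ... plus whatever is needed for bucket-to-block mapping ... */
} MetaCache;

extern MetaCache *refresh_meta_cache(Relation rel);     /* re-reads metapage */
extern BlockNumber cached_bucket_to_blkno(MetaCache *cache, Bucket bucket);

static Buffer
lock_target_bucket(Relation rel, uint32 hashkey, MetaCache *cache)
{
    for (;;)
    {
        Bucket      bucket;
        Buffer      buf;
        HashPageOpaque opaque;

        /* Map the hash key to a bucket using the cached fields only. */
        bucket = _hash_hashkey2bucket(hashkey,
                                      cache->maxbucket,
                                      cache->highmask,
                                      cache->lowmask);

        buf = _hash_getbuf(rel, cached_bucket_to_blkno(cache, bucket),
                           HASH_WRITE, LH_BUCKET_PAGE);
        opaque = (HashPageOpaque) PageGetSpecialPointer(BufferGetPage(buf));

        /*
         * The patch stores the table size as of the bucket's last split in
         * hasho_prevblkno; if that is newer than our cached maxbucket, the
         * cache is stale, so drop the lock, refresh, and retry.
         */
        if (opaque->hasho_prevblkno <= cache->maxbucket)
            return buf;

        _hash_relbuf(rel, buf);
        cache = refresh_meta_cache(rel);
    }
}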
I did some pgbench simple-update tests for this, with the changes below.
- "alter table pgbench_branches add primary key (bid)",
- "alter table pgbench_tellers add primary key (tid)",
- "alter table pgbench_accounts add primary key (aid)"
+ "create index pgbench_branches_bid on pgbench_branches
using hash (bid)",
+ "create index pgbench_tellers_tid on pgbench_tellers using
hash (tid)",
+ "create index pgbench_accounts_aid on pgbench_accounts
using hash (aid)"
I also removed all of the reference keys. But I see no improvement; I will do
further benchmarking with the COPY command and report the results.
Clients | After Meta page cache (TPS) | Base Code (TPS) | %imp
---|---|---|---
1 | 2276.151633 | 2304.253611 | -1.2195696631
32 | 36816.596513 | 36439.552652 | 1.0347104549
64 | 50943.763133 | 51005.236973 | -0.120524565
128 | 49156.980457 | 48458.275106 | 1.4418700407
The above results are the median of three runs; each run was 30 minutes long.
*Postgres Server settings:*
./postgres -c shared_buffers=8GB -N 200 -c min_wal_size=15GB -c
max_wal_size=20GB -c checkpoint_timeout=900 -c maintenance_work_mem=1GB -c
checkpoint_completion_target=0.9
*pgbench settings:*
scale_factor = 300 (so the database fits in shared_buffers)
Mode = -M prepared -N (prepared simple-update).
*Machine used:*
power2, same as described above.
> I think it's probably not a good idea to cache the entire metapage. The
> only things that you are "really" trying to cache, I think, are
> hashm_maxbucket, hashm_lowmask, and hashm_highmask. The entire
> HashPageMetaData structure is 696 bytes on my machine, and it doesn't
> really make sense to copy the whole thing into memory if you only need 16
> bytes of it. It could even be dangerous -- if somebody tries to rely on
> the cache for some other bit of data and we're not really guaranteeing that
> it's fresh enough for that.
>
> I'd suggest defining a new structure HashMetaDataCache with members
> hmc_maxbucket, hmc_lowmask, and hmc_highmask. The structure can have a
> comment explaining that we only care about having the data be fresh enough
> to test whether the bucket mapping we computed for a tuple is still
> correct, and that for that to be the case we only need to know whether a
> bucket has suffered a new split since we last refreshed the cache.
>
It is not only hashm_maxbucket, hashm_lowmask, and hashm_highmask (3 uint32s);
we also need uint32 hashm_spares[HASH_MAX_SPLITPOINTS] for the
bucket-number-to-block mapping in "BUCKET_TO_BLKNO(metap, bucket)".
Note: #define HASH_MAX_SPLITPOINTS 32, so the cache would be
(3 + 32) uint32s = 35 * 4 = 140 bytes.
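As a rough illustration (member names are mine, not from the patch), a cache
that is sufficient for the bucket mapping could look like the sketch below; the
helper mirrors the arithmetic BUCKET_TO_BLKNO() does against the metapage, but
reads the spares[] array from the cache instead:

#include "postgres.h"
#include "access/hash.h"

typedef struct HashMetaDataCache
{
    uint32      hmc_maxbucket;  /* ID of maximum bucket in use */
    uint32      hmc_highmask;   /* mask to modulo into entire table */
    uint32      hmc_lowmask;    /* mask to modulo into lower half of table */
    uint32      hmc_spares[HASH_MAX_SPLITPOINTS];   /* spare pages allocated
                                                     * before each splitpoint */
} HashMetaDataCache;

/*
 * Same mapping as BUCKET_TO_BLKNO(metap, bucket), using cached spares[]:
 * block number = bucket + spare (overflow/bitmap) pages allocated before the
 * bucket's splitpoint + 1 for the metapage itself.
 */
static BlockNumber
cached_bucket_to_blkno(HashMetaDataCache *cache, Bucket bucket)
{
    uint32      spares;

    spares = bucket ? cache->hmc_spares[_hash_log2(bucket + 1) - 1] : 0;
    return (BlockNumber) (bucket + spares + 1);
}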
>
> The comments in this patch need some work, e.g.:
>
> -
> + oopaque->hasho_prevblkno = maxbucket;
>
> No comment?
>
>
I have tried to improve the comments in the new patch.
Apart from this, there seems to be a pre-existing bug in _hash_doinsert():
+ * XXX this is useless code if we are only storing hash keys.
+ */
+ if (itemsz > HashMaxItemSize((Page) metap))
+ ereport(ERROR,
+ (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+ errmsg("index row size %zu exceeds hash maximum %zu",
+ itemsz, HashMaxItemSize((Page) metap)),
+ errhint("Values larger than a buffer page cannot be
indexed.")));
"metap" (HashMetaPage) and Page are different data structure their member
types are not in sync, so should not typecast blindly as above. I think we
should remove this part of the code as we only store hash keys. So I have
removed same but kept the assert below as it is.
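To illustrate the layout problem (a descriptive sketch, not code from the
patch): HashPageGetMeta() returns a pointer into the page contents, past the
page header, so casting that pointer back to Page makes header-reading macros
such as PageGetPageSize() (which HashMaxItemSize() relies on) read bytes out of
HashMetaPageData instead of the real page header.

#include "postgres.h"
#include "access/hash.h"
#include "storage/bufpage.h"

static void
metapage_layout_example(Buffer metabuf)
{
    Page        page = BufferGetPage(metabuf);

    /*
     * HashPageGetMeta() returns a pointer into the page *contents*, i.e.
     * past the PageHeaderData at the start of the buffer.
     */
    HashMetaPage metap = HashPageGetMeta(page);

    /*
     * Macros such as PageGetPageSize() expect a pointer to the start of the
     * page so they can read the page header.  That works for "page" ...
     */
    Assert(PageGetPageSize(page) == BLCKSZ);

    /*
     * ... but "(Page) metap" does not point at a page header at all, so the
     * same macro applied to it would return garbage.  Hence the cast in the
     * quoted hunk is unsafe.
     */
    (void) metap;
}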
Also, there was a bug in the previous patch: I was not releasing the bucket
page lock when the cached metadata was stale. That is now fixed.
--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
cache_hash_index_meta_page_07.patch | application/octet-stream | 18.8 KB |