From: | Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | David Steele <david(at)pgmasters(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [POC] A better way to expand hash indexes. |
Date: | 2017-03-26 05:56:12 |
Message-ID: | CAD__Oug6QTFHBbV=86bc0pxCegJ5NP5EwY8yQgBHDO8G3VbSbA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Thanks, Amit for the review.
On Sat, Mar 25, 2017 at 7:03 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> I think one-dimensional patch has fewer places to touch, so that looks
> better to me. However, I think there is still hard coding and
> assumptions in code which we should try to improve.
Great!, I will continue with spares 1-dimensional improvement.
>
> 1.
> + /*
> + * The first 4 bucket belongs to first splitpoint group 0. And since group
> + * 0 have 4 = 2^2 buckets, we double them in group 1. So total buckets
> + * after group 1 is 8 = 2^3. Then again at group 2 we add another 2^3
> + * buckets to double the total number of buckets, which will become 2^4. I
> + * think by this time we can see a pattern which say if num_bucket > 4
> + * splitpoint group = log2(num_bucket) - 2
> + */
>
> + if (num_bucket <= 4)
> + splitpoint_group = 0; /* converted to base 0. */
> + else
> + splitpoint_group = _hash_log2(num_bucket) - 2;
>
> This patch defines split point group zero has four buckets and based
> on that above calculation is done. I feel you can define it like
> #define Buckets_First_Split_Group 4 and then use it in above code.
> Also, in else part number 2 looks awkward, can we define it as
> log2_buckets_first_group = _hash_log2(Buckets_First_Split_Group) or
> some thing like that. I think that way code will look neat. I don't
> like the way above comment is worded even though it is helpful to
> understand the calculation. If you want, then you can add such a
> comment in file header, here it looks out of place.
I have removed the comments. And, defined a new macro which maps
bucket to SPLIT GROUP
#define BUCKET_TO_SPLITPOINT_GRP(num_bucket) \
((num_bucket <= Buckets_First_Split_Group) ? 0 : \
(_hash_log2(num_bucket) - 2))
I could not find a way to explain why minus 2? better than " The
splitpoint group of a given bucket can be taken as
(_hash_log2(bucket) - 2). Subtracted by 2 because each group have 2 ^
(x + 2) buckets.". Now I have added those with existing comments I
think that should make it little better.
Adding comments about spares array in hashutils.c's file header did
not appear right to me. I think README has some details about same.
> 2.
> +power-of-2 groups, called "split points" in the code. That means on every new
> +split points we double the existing number of buckets.
>
> split points/split point
Fixed.
> 3.
> + * which phase of allocation the bucket_num belogs to with in the group.
>
> /belogs/belongs
Fixed
--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
expand_hashbucket_efficiently_07.patch | application/octet-stream | 19.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Nikhil Sontakke | 2017-03-26 07:50:47 | Re: Speedup twophase transactions |
Previous Message | Peter Geoghegan | 2017-03-26 05:33:51 | Re: WIP: [[Parallel] Shared] Hash |