From: | Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> |
---|---|
To: | Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Page Scan Mode in Hash Index |
Date: | 2017-03-22 13:32:11 |
Message-ID: | CAE9k0P=QfrT+ZvLrVXDPiVL61FKjc35H2eQHGHaz687n2vCGVQ@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
>> Attached patch modifies hash index scan code for page-at-a-time mode.
>> For better readability, I have split it into 3 parts,
>>
>
> Due to the commits on master these patches apply with hunks.
>
> The README should be updated to mention the use of page scan.
Done. Please refer to the attached v2 version of the patch.
>
> hash.h needs pg_indent.
Fixed.
>
>> 1) 0001-Rewrite-hash-index-scans-to-work-a-page-at-a-time.patch: this
>> patch rewrites the hash index scan module to work in page-at-a-time
>> mode. It basically introduces two new functions-- _hash_readpage() and
>> _hash_saveitem(). The former is used to load all the qualifying tuples
>> from a target bucket or overflow page into an items array. The latter
>> one is used by _hash_readpage to save all the qualifying tuples found
>> in a page into an items array. Apart from that, this patch basically
>> cleans up _hash_first(), _hash_next() and hashgettuple().
>>
>
> For _hash_next I don't see this - can you explain ?
Sorry, it was wrongly copied from the btree code. I have corrected it now. Please
check the attached v2 version of the patch.
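
To make the page-at-a-time idea from the 0001 description concrete, here is a minimal, self-contained sketch in C. It is not the patch code: the `Demo*` types, `MAX_ITEMS_PER_PAGE`, and the `demo_*` functions are hypothetical stand-ins, and buffer pins, locks, and dead-tuple handling are deliberately left out. It only shows the general shape: one helper copies every qualifying tuple of a page into an items array, and the scan then returns matches from that array before moving to the next page in the chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ITEMS_PER_PAGE 8    /* hypothetical; not the real per-page limit */

/* Simplified stand-in for an index tuple on a bucket or overflow page. */
typedef struct DemoTuple {
    uint32_t hashkey;
    uint32_t heap_tid;
} DemoTuple;

typedef struct DemoPage {
    DemoTuple tuples[MAX_ITEMS_PER_PAGE];
    int ntuples;
    struct DemoPage *next;      /* next overflow page, or NULL */
} DemoPage;

/* Scan position: matches collected from the page currently loaded. */
typedef struct DemoScanPos {
    uint32_t items[MAX_ITEMS_PER_PAGE];
    int nitems;
    int itemIndex;
    DemoPage *page;
} DemoScanPos;

/* Plays the role described for _hash_saveitem(): remember one match. */
static void demo_saveitem(DemoScanPos *pos, const DemoTuple *tup)
{
    assert(pos->nitems < MAX_ITEMS_PER_PAGE);
    pos->items[pos->nitems++] = tup->heap_tid;
}

/* Plays the role described for _hash_readpage(): collect all matches on
 * one page in a single pass. Return true if any matching items are found. */
static bool demo_readpage(DemoScanPos *pos, DemoPage *page, uint32_t key)
{
    pos->nitems = 0;
    pos->itemIndex = 0;
    pos->page = page;
    for (int i = 0; i < page->ntuples; i++)
        if (page->tuples[i].hashkey == key)
            demo_saveitem(pos, &page->tuples[i]);
    return pos->nitems > 0;
}

/* Return the next match from the items array; when it is drained, load
 * the next page in the chain. Return false once the chain is exhausted. */
static bool demo_next(DemoScanPos *pos, uint32_t key, uint32_t *tid)
{
    while (pos->page != NULL) {
        if (pos->itemIndex < pos->nitems) {
            *tid = pos->items[pos->itemIndex++];
            return true;
        }
        DemoPage *next = pos->page->next;
        if (next == NULL) {
            pos->page = NULL;   /* no more pages: scan is finished */
            break;
        }
        demo_readpage(pos, next, key);
    }
    return false;
}
```

A caller would invoke `demo_readpage()` once on the primary bucket page and then call `demo_next()` repeatedly until it returns false. The point of the arrangement is that a page is examined once per visit rather than once per returned tuple.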
>
> + *
> + * On failure exit (no more tuples), we release pin and set
> + * so->currPos.buf to InvalidBuffer.
>
>
> + * Returns true if any matching items are found else returns false.
>
> s/Returns/Return/g
Done.
>
>> 2) 0002-Remove-redundant-function-_hash_step-and-some-of-the.patch:
>> this patch basically removes the redundant function _hash_step() and
>> some of the unused members of HashScanOpaqueData structure.
>>
>
> Looks good.
>
>> 3) 0003-Improve-locking-startegy-during-VACUUM-in-Hash-Index.patch:
>> this patch basically improves the locking strategy for VACUUM in hash
>> index. As the new hash index scan works in page-at-a-time, vacuum can
>> release the lock on previous page before acquiring a lock on the next
>> page, hence, improving hash index concurrency.
>>
>
> + * As the new hash index scan work in page at a time mode,
>
> Remove 'new'.
Done.
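
As a rough illustration of the locking change described for 0003, the toy model below contrasts two ways of walking a bucket chain. Everything in it is hypothetical (`DemoChainPage`, the `demo_*` functions, and the counters); it uses plain flags rather than real buffer locks, and the "coupled" variant is only my assumption of what holding the lock on the previous page while taking the next one looks like. The second variant releases the current page before locking the next, which is the behaviour the description says page-at-a-time scans make possible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy page in a bucket chain; "locked" is a flag, not a real lock. */
typedef struct DemoChainPage {
    bool locked;
    bool cleaned;
    struct DemoChainPage *next;
} DemoChainPage;

static int locks_held = 0;      /* locks currently held by the walker */
static int max_locks_held = 0;  /* high-water mark across one walk */

static void demo_reset_counters(void)
{
    locks_held = 0;
    max_locks_held = 0;
}

static void demo_lock(DemoChainPage *p)
{
    assert(!p->locked);
    p->locked = true;
    if (++locks_held > max_locks_held)
        max_locks_held = locks_held;
}

static void demo_unlock(DemoChainPage *p)
{
    assert(p->locked);
    p->locked = false;
    locks_held--;
}

/* Assumed "coupled" walk: lock the next page before releasing the
 * current one, so two pages are briefly locked at the same time. */
static void demo_clean_chain_coupled(DemoChainPage *head)
{
    DemoChainPage *cur = head;
    if (cur == NULL)
        return;
    demo_lock(cur);
    while (cur != NULL) {
        cur->cleaned = true;
        DemoChainPage *next = cur->next;
        if (next != NULL)
            demo_lock(next);
        demo_unlock(cur);
        cur = next;
    }
}

/* Release-first walk: at most one page is locked at any moment. */
static void demo_clean_chain_release_first(DemoChainPage *head)
{
    DemoChainPage *cur = head;
    while (cur != NULL) {
        demo_lock(cur);
        cur->cleaned = true;
        DemoChainPage *next = cur->next;   /* read the link while locked */
        demo_unlock(cur);
        cur = next;
    }
}
```

In this model the coupled walk peaks at two held locks and the release-first walk at one. That difference is the only thing the sketch demonstrates; it says nothing about the correctness conditions the real patch has to satisfy.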
>
>> I have also done the benchmarking of this patch and would like to
>> share the results for the same,
>>
>> Firstly, I have done the benchmarking with non-unique values and I
>> could see a performance improvement of 4-7%. For the detailed results
>> please find the attached file 'results-non-unique values-70ff', and
>> ddl.sql, test.sql are test scripts used in this experimentation. The
>> detail of non-default GUC params and pgbench command are mentioned in
>> the result sheet. I also did the benchmarking with unique values at
>> 300 and 1000 scale factor and its results are provided in
>> 'results-unique-values-default-ff'.
>>
>
> I'm seeing similar results, and especially with write heavy scenarios.
Great..!!
--
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
0001-Rewrite-hash-index-scans-to-work-a-page-at-a-timev2.patch | application/x-patch | 23.6 KB |
0002-Remove-redundant-function-_hash_step-and-some-of-the.patch | application/x-patch | 8.4 KB |
0003-Improve-locking-startegy-during-VACUUM-in-Hash-Index.patch | application/x-patch | 1.3 KB |