From: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>
To: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Page Scan Mode in Hash Index
Date: 2017-03-21 18:18:10
Message-ID: 0aaa9067-1e3c-868e-46fe-611a797ffd88@redhat.com
Lists: pgsql-hackers
Hi,
On 02/14/2017 12:27 AM, Ashutosh Sharma wrote:
> Currently, Hash Index scan works tuple-at-a-time, i.e. for every
> qualifying tuple in a page, it acquires and releases the lock, which
> eventually increases the lock/unlock traffic. For example, if an index
> page contains 100 qualified tuples, the current hash index scan has to
> acquire and release the lock 100 times to read those qualified tuples,
> which is not good from a performance perspective and it also impacts
> concurrency with VACUUM.
>
> Considering above points, I would like to propose a patch that allows
> hash index scan to work in page-at-a-time mode. In page-at-a-time
> mode, once lock is acquired on a target bucket page, the entire page
> is scanned and all the qualified tuples are saved into backend's local
> memory. This reduces the lock/unlock calls for retrieving tuples from
> a page. Moreover, it also eliminates the problem of re-finding the
> position of the last returned index tuple and, more importantly, it
> allows VACUUM to release the lock on the current page before moving to the
> next page, which eventually improves its concurrency with scans.
>
> Attached patch modifies hash index scan code for page-at-a-time mode.
> For better readability, I have split it into 3 parts,
>
Due to the commits on master, these patches apply with hunks.
The README should be updated to mention the use of page scan.
hash.h needs pg_indent.
> 1) 0001-Rewrite-hash-index-scans-to-work-a-page-at-a-time.patch: this
> patch rewrites the hash index scan module to work in page-at-a-time
> mode. It basically introduces two new functions-- _hash_readpage() and
> _hash_saveitem(). The former is used to load all the qualifying tuples
> from a target bucket or overflow page into an items array. The latter
> one is used by _hash_readpage to save all the qualifying tuples found
> in a page into an items array. Apart from that, this patch basically
> cleans up _hash_first(), _hash_next() and hashgettuple().
>
For _hash_next() I don't see this - can you explain?
+ *
+ * On failure exit (no more tuples), we release pin and set
+ * so->currPos.buf to InvalidBuffer.
+ * Returns true if any matching items are found else returns false.
s/Returns/Return/g
> 2) 0002-Remove-redundant-function-_hash_step-and-some-of-the.patch:
> this patch basically removes the redundant function _hash_step() and
> some of the unused members of HashScanOpaqueData structure.
>
Looks good.
> 3) 0003-Improve-locking-startegy-during-VACUUM-in-Hash-Index.patch:
> this patch basically improves the locking strategy for VACUUM in hash
> index. As the new hash index scan works in page-at-a-time mode, vacuum can
> release the lock on previous page before acquiring a lock on the next
> page, hence, improving hash index concurrency.
>
+ * As the new hash index scan work in page at a time mode,
Remove 'new'.
> I have also done the benchmarking of this patch and would like to
> share the results for the same,
>
> Firstly, I have done the benchmarking with non-unique values and I
> could see a performance improvement of 4-7%. For the detailed results,
> please find the attached file 'results-non-unique values-70ff';
> ddl.sql and test.sql are the test scripts used in this experiment. The
> details of the non-default GUC params and the pgbench command are mentioned
> in the result sheet. I also did the benchmarking with unique values at
> 300 and 1000 scale factor and its results are provided in
> 'results-unique-values-default-ff'.
>
I'm seeing similar results, especially in write-heavy scenarios.
Best regards,
Jesper