Re: Page Scan Mode in Hash Index

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Page Scan Mode in Hash Index
Date: 2017-03-13 05:46:03
Message-ID: CAE9k0PmhjeTwjcM7NRSYhfkaDi69XNwzyzuQtONJiAdgAkKSww@mail.gmail.com
Lists: pgsql-hackers

Hi,

>
> I've been assigned to review this patch.
> First, I'd like to note that I like the idea and the general design.
> Secondly, the patch set doesn't apply cleanly to master. Please rebase it.

Thanks for showing interest in this patch. I would like to inform you
that this patch depends on the patches for 'Write Ahead Logging in
hash index' [1] and 'Microvacuum support in hash index' [2]. Hence,
until the above two patches become stable, I may have to keep rebasing
this patch. However, I will try to share the updated patch with you
asap.

>
>
> On Tue, Feb 14, 2017 at 8:27 AM, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
>>
>> 1) 0001-Rewrite-hash-index-scans-to-work-a-page-at-a-time.patch: this
>> patch rewrites the hash index scan module to work in page-at-a-time
>> mode. It basically introduces two new functions -- _hash_readpage() and
>> _hash_saveitem(). The former is used to load all the qualifying tuples
>> from a target bucket or overflow page into an items array. The latter
>> is used by _hash_readpage() to save all the qualifying tuples found
>> on a page into the items array. Apart from that, this patch basically
>> cleans up _hash_first(), _hash_next() and hashgettuple().
>
>
> I see that the forward and backward scan cases of the function _hash_readpage() contain a lot of code duplication.
> Could you please refactor this function to have less code duplication?

Sure, I will try to avoid the code duplication as much as possible.
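For what it's worth, below is a minimal, self-contained sketch (plain
C, not the actual PostgreSQL code; read_page(), save_item() and
tuple_qualifies() are made-up stand-ins for illustration) of one common
way to fold the two directional loops into one: the scan direction only
selects the starting offset and the step, so the qualification and save
logic is written exactly once.

#include <stdbool.h>
#include <stdio.h>

typedef enum
{
    ForwardScanDirection = 1,
    BackwardScanDirection = -1
} ScanDirection;

/* stand-in for the tuple qualification check */
static bool
tuple_qualifies(int tuple)
{
    return tuple % 2 == 0;
}

/* stand-in for _hash_saveitem(): record a qualifying tuple */
static void
save_item(int *items, int *nitems, int tuple)
{
    items[(*nitems)++] = tuple;
}

/*
 * One loop body serves both scan directions: the direction picks the
 * starting offset and the step, the body is shared.
 */
static int
read_page(const int *page, int maxoff, ScanDirection dir, int *items)
{
    int nitems = 0;
    int start = (dir == ForwardScanDirection) ? 0 : maxoff - 1;
    int end = (dir == ForwardScanDirection) ? maxoff : -1;

    for (int off = start; off != end; off += (int) dir)
    {
        if (tuple_qualifies(page[off]))
            save_item(items, &nitems, page[off]);
    }

    return nitems;
}

int
main(void)
{
    int page[] = {3, 4, 7, 8, 10};
    int items[5];
    int n = read_page(page, 5, BackwardScanDirection, items);

    for (int i = 0; i < n; i++)
        printf("%d ", items[i]);    /* prints: 10 8 4 */
    printf("\n");
    return 0;
}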

>
> Also, I wonder if you have a special idea behind inserting data in test.sql with 1002 separate SQL statements like:
> INSERT INTO con_hash_index_table (keycol) SELECT a FROM GENERATE_SERIES(1, 1000) a;
>
> You can achieve the same result by executing a single SQL statement:
> INSERT INTO con_hash_index_table (keycol) SELECT (a - 1) % 1000 + 1 FROM GENERATE_SERIES(1, 1002000) a;
> Unless you have some special idea of doing this in 1002 separate transactions.

There is no particular reason for having so many INSERT statements in
the test.sql file. I agree it would be better to replace them with a
single SQL statement. Thanks.

[1]- https://www.postgresql.org/message-id/CAA4eK1KibVzgVETVay0%2BsiVEgzaXnP5R21BdWiK9kg9wx2E40Q%40mail.gmail.com
[2]- https://www.postgresql.org/message-id/CAE9k0PkRSyzx8dOnokEpUi2A-RFZK72WN0h9DEMv_ut9q6bPRw%40mail.gmail.com

--
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com
