Re: why do hash index builds use smgrextend() for new splitpoint pages

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: why do hash index builds use smgrextend() for new splitpoint pages
Date: 2022-02-25 03:24:29
Message-ID: CAA4eK1+e5g9utoXV5K4u5QXfwDm5eF5AXMjQkvY357M3Y9gipA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 25, 2022 at 4:41 AM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> I'm trying to understand why hash indexes are built primarily in shared
> buffers except when allocating a new splitpoint's worth of bucket pages
> -- which is done with smgrextend() directly in _hash_alloc_buckets().
>
> Is this just so that the value returned by smgrnblocks() includes the
> new splitpoint's worth of bucket pages?
>
> All writes of tuple data to pages in this new splitpoint will go
> through shared buffers (via _hash_getnewbuf()).
>
> I asked this and got some thoughts from Robert in [1], but I still don't
> really get it.
>
> When a new page is needed during the hash index build, why can't
> _hash_expandtable() just call ReadBufferExtended() with P_NEW instead of
> _hash_getnewbuf()? Does it have to do with the BUCKET_TO_BLKNO mapping?
>

We allocate a chunk of pages (in power-of-2 groups) at the time of the
split, which allows the bucket pages to appear consecutively in the
index. This makes it easy to compute the physical block number from the
bucket number (the BUCKET_TO_BLKNO mapping) with only minimal control
information.
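To make the arithmetic concrete, here is a simplified Python sketch of the
idea behind the BUCKET_TO_BLKNO mapping. It is not the actual macro from
src/include/access/hash.h, and it ignores the phased allocation of large
splitpoints; the `spares` list stands in for the metapage's hashm_spares[]
array (cumulative overflow-page counts per splitpoint), which is an
assumption of this sketch.

```python
# Hypothetical sketch of the BUCKET_TO_BLKNO idea (not PostgreSQL's real
# macro). Layout assumed here: block 0 is the metapage; each power-of-2
# splitpoint group of bucket pages is physically consecutive; overflow
# pages allocated in earlier splitpoints sit before a group, so they must
# be skipped over. spares[i] is the cumulative count of overflow pages
# allocated through splitpoint i, mirroring hashm_spares[].

def bucket_to_blkno(bucket: int, spares: list[int]) -> int:
    """Map a bucket number to a physical block number:
    1 (metapage) + bucket + overflow pages preceding its group."""
    if bucket == 0:
        return 1  # bucket 0 immediately follows the metapage
    # Splitpoint group of `bucket` is ceil(log2(bucket + 1)); for the
    # values used here that equals bucket.bit_length(). Overflow pages
    # from all earlier groups precede it: spares[group - 1].
    group = bucket.bit_length()
    return 1 + bucket + spares[group - 1]

# With no overflow pages, buckets map to consecutive blocks after the
# metapage, which is what lets a whole splitpoint be pre-extended at once.
print(bucket_to_blkno(0, [0] * 32))  # 1
print(bucket_to_blkno(3, [0] * 32))  # 4
```

Because each splitpoint's bucket pages are reserved contiguously up front
(the smgrextend() in _hash_alloc_buckets()), this computation needs no
per-bucket bookkeeping beyond the small spares[] array.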

--
With Regards,
Amit Kapila.
