Re: pgsql: Optimize btree insertions for common case of increasing values

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgsql: Optimize btree insertions for common case of increasing values
Date: 2018-03-28 09:14:44
Message-ID: CABOikdMCoXG3pMU7zUjO2wdVudvZ-5SzD7McedzRbitN_VvuUQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers

On Wed, Mar 28, 2018 at 11:33 AM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

>
>
> This code can allocate memory in a critical section, since
> RelationSetTargetBlock() can call smgropen():
>
> if (P_ISLEAF(lpageop))
> + {
> xlinfo = XLOG_BTREE_INSERT_LEAF;
> +
> + /*
> + * Cache the block information if we just inserted into the
> + * rightmost leaf page of the index.
> + */
> + if (P_RIGHTMOST(lpageop))
> + RelationSetTargetBlock(rel, BufferGetBlockNumber(buf));
> + }
>
> rd_smgr can be closed by a relcache flush, so it is in fact possible
> that we'll reach smgropen().
>

I think you're right. I propose that we call RelationSetTargetBlock right
after the critical section ends. That should not have any adverse side
effect since we're still holding WRITE lock on the page, it's just an hint
anyways and the next time we shall do a proper recheck.

>
> As I already pointed out, it's unclear why some of these tests are
> used, such as P_INCOMPLETE_SPLIT():
>

> + /*
> + * Check if the page is still the rightmost leaf page, has
> enough
> + * free space to accommodate the new tuple, no split is in
> progress
> + * and the scankey is greater than or equal to the first key
> on the
> + * page.
> + */
> + if (P_ISLEAF(lpageop) && P_RIGHTMOST(lpageop) &&
> + !P_INCOMPLETE_SPLIT(lpageop) &&
> + !P_IGNORE(lpageop) &&
> + (PageGetFreeSpace(page) > itemsz) &&
> + PageGetMaxOffsetNumber(page) >=
> P_FIRSTDATAKEY(lpageop) &&
> + _bt_compare(rel, natts, itup_scankey, page,
> + P_FIRSTDATAKEY(lpageop)) > 0)
>
> The BTP_INCOMPLETE_SPLIT bit means that the page's right sibling's
> downlink is missing, but how can a rightmost page have a right sibling
> in the first place? We don't bother to check that when we already know
> the page is rightmost within _bt_moveright() or _bt_findinsertloc().
>

Hmm. I agree. I am sorry, I missed that comment during the review process.
I checked the code again and BTP_INCOMPLETE_SPLIT is always set along with
the right-link. So the rightmost page, which does not have a right-link,
must not have BTP_INCOMPLETE_SPLIT set. That test is thus redundant (though
it should not give any wrong answers).

Previously, we agreed that P_IGNORE() is required. So I assume no issues
there. The other tests seem required too for us to conclusively decide to
use the cached block.

>
> I also noticed that favorable/favourable is misspelled "favourble".
> Also, "workout" is a noun.
>

Fixed in the attached delta.

Please have a look.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
pg_btree_target_block_v4_delta.patch application/octet-stream 2.1 KB

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Etsuro Fujita 2018-03-28 12:13:03 Re: pgsql: Fast ALTER TABLE ADD COLUMN with a non-NULL default
Previous Message Konstantin Knizhnik 2018-03-28 07:44:28 Re: pgsql: Allow HOT updates for some expression indexes