Re: pgsql: Optimize btree insertions for common case of increasing values

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgsql: Optimize btree insertions for common case of increasing values
Date: 2018-04-05 17:16:52
Message-ID: CAH2-WzkpJd5TP6uRCqa473mzZCNsVk5KjE3xrs-4=FfFiAMHog@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers

On Thu, Apr 5, 2018 at 6:27 AM, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com> wrote:
> I came up with something like this:
>
> + /*
> + * If we're here then a pagesplit is needed. We should never
> reach here
> + * if we're using the fastpath since we should have checked
> for all the
> + * required conditions, including the fact that this page
> has enough
> + * freespace. Note that this routine can in-theory deal with
> the
> + * situation where a NULL stack pointer is passed (that's
> what would
> + * happen if the fastpath is taken), like it does during
> crash
> + * recovery. But that path is much slower, defeating the
> very purpose
> + * of the optimisation. The following assertion should
> protect us from
> + * any future code changes that invalidate those
> assumptions.
> + *
> + * Note the fact that whenever we fail to take the fastpath,
> we clear
> + * the cached block. So checking for a valid cached block at
> this point
> + * is enough to decide whether we're in a fastpath or not.
> + */
> + Assert(!(P_ISLEAF(lpageop) &&
> +
> BlockNumberIsValid(RelationGetTargetBlock(rel))));
> +

This looks good to me.

> But then I started thinking can _bt_insertonpg() be called from a code path
> which doesn't start at _bt_doinsert()? I traced one such path
> _bt_getstackbuf() -> _bt_finish_split() -> _bt_insert_parent(), but that can
> only happen in case of a non-leaf page. The assertion which checks for a
> leaf page, should be fine, right?

Right. That's the whole reason for the P_ISLEAF() part of the
assertion. Actually, since a page in an internal page can only happen
as a direct consequence of a page split in one of its children, you
don't even need the P_ISLEAF() part. I'd leave it in, though, since it
makes it clearer which caller you're thinking about.

>> Finally, I'd like to see a small paragraph in the nbtree README, about
>> the high level theory behind this optimization and page recycling. The
>> assumption is that there can only be one non-ignorable leaf rightmost
>> page, and so even a RecentGlobalXmin style interlock isn't required.
>> We cannot fail to detect that our hint was invalidated, because there
>> can only be one such page in the B-Tree at any time. It's possible
>> that the page will be deleted and recycled without a backend's cached
>> page also being detected as invalidated, but only when we happen to
>> recycle a page that once again becomes the rightmost leaf page.
>>
>
> Can I borrow that almost verbatim, after adding details about the
> optimisation itself? Looks like I'm too tired to think sensibly.

I know the feeling.

I think you can take that wording almost verbatim. Obviously it should
refer to the optimization by name, and blend into the surrounding text
in the README. I suggest putting a small section before "On-the-Fly
Deletion Of Index Tuples", but after the main discussion of deletion +
recycling. It's essentially an exception to the general rule, so that
placement makes sense to me.

Thanks
--
Peter Geoghegan

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Andres Freund 2018-04-05 18:19:51 Re: pgsql: New files for MERGE
Previous Message Magnus Hagander 2018-04-05 17:15:00 pgsql: Fix worker_spi for new parameter to initialize connection