From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com> |
Cc: | Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: pgsql: Optimize btree insertions for common case of increasing values |
Date: | 2018-04-05 17:16:52 |
Message-ID: | CAH2-WzkpJd5TP6uRCqa473mzZCNsVk5KjE3xrs-4=FfFiAMHog@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-committers |
On Thu, Apr 5, 2018 at 6:27 AM, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com> wrote:
> I came up with something like this:
>
> + /*
> + * If we're here then a pagesplit is needed. We should never
> reach here
> + * if we're using the fastpath since we should have checked
> for all the
> + * required conditions, including the fact that this page
> has enough
> + * freespace. Note that this routine can in-theory deal with
> the
> + * situation where a NULL stack pointer is passed (that's
> what would
> + * happen if the fastpath is taken), like it does during
> crash
> + * recovery. But that path is much slower, defeating the
> very purpose
> + * of the optimisation. The following assertion should
> protect us from
> + * any future code changes that invalidate those
> assumptions.
> + *
> + * Note the fact that whenever we fail to take the fastpath,
> we clear
> + * the cached block. So checking for a valid cached block at
> this point
> + * is enough to decide whether we're in a fastpath or not.
> + */
> + Assert(!(P_ISLEAF(lpageop) &&
> +
> BlockNumberIsValid(RelationGetTargetBlock(rel))));
> +
This looks good to me.
> But then I started thinking can _bt_insertonpg() be called from a code path
> which doesn't start at _bt_doinsert()? I traced one such path
> _bt_getstackbuf() -> _bt_finish_split() -> _bt_insert_parent(), but that can
> only happen in case of a non-leaf page. The assertion which checks for a
> leaf page, should be fine, right?
Right. That's the whole reason for the P_ISLEAF() part of the
assertion. Actually, since a page in an internal page can only happen
as a direct consequence of a page split in one of its children, you
don't even need the P_ISLEAF() part. I'd leave it in, though, since it
makes it clearer which caller you're thinking about.
>> Finally, I'd like to see a small paragraph in the nbtree README, about
>> the high level theory behind this optimization and page recycling. The
>> assumption is that there can only be one non-ignorable leaf rightmost
>> page, and so even a RecentGlobalXmin style interlock isn't required.
>> We cannot fail to detect that our hint was invalidated, because there
>> can only be one such page in the B-Tree at any time. It's possible
>> that the page will be deleted and recycled without a backend's cached
>> page also being detected as invalidated, but only when we happen to
>> recycle a page that once again becomes the rightmost leaf page.
>>
>
> Can I borrow that almost verbatim, after adding details about the
> optimisation itself? Looks like I'm too tired to think sensibly.
I know the feeling.
I think you can take that wording almost verbatim. Obviously it should
refer to the optimization by name, and blend into the surrounding text
in the README. I suggest putting a small section before "On-the-Fly
Deletion Of Index Tuples", but after the main discussion of deletion +
recycling. It's essentially an exception to the general rule, so that
placement makes sense to me.
Thanks
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2018-04-05 18:19:51 | Re: pgsql: New files for MERGE |
Previous Message | Magnus Hagander | 2018-04-05 17:15:00 | pgsql: Fix worker_spi for new parameter to initialize connection |