Quick Links

Re: WIP: Fast GiST index build

From:	Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: WIP: Fast GiST index build
Date:	2011-08-11 10:21:29
Message-ID:	CAPpHfdvhoYKUUCVaC-yFbDaE5j2QTa94cVgYuKnzX4xwWUjSTg@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Wed, Aug 10, 2011 at 11:45 PM, Heikki Linnakangas <
heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:

> unloadNodeBuffers() is now dead code.
>
processEmptyingStack calls it.

LEAF_PAGES_STATS_* are unused now.

Removed.

> Should avoid calling smgrnblocks() on every tuple, the overhead of that
> could add up.

Now calling at each BUFFERING_MODE_SWITCH_CHECK_STEP(256) tuples.

> I wonder, how hard would it be to merge gistBufferingBuildPlaceToPage(**)
> with the gistplacetopage() function used in the main codepath? There's very
> little difference between them, and it would be nice to maintain just one
> function. At the very least I think there should be a comment in both along
> the lines of "NOTE: if you change this function, make sure you update XXXX
> (the other function) as well!"
>
I doubt they can be effectively merged, but will try.

> In gistbuild(), in the final emptying stage, there's this special-case
> handling for the root block before looping through the buffers in the
> buffersOnLevels lists:
>
> for (;;)
>> {
>> nodeBuffer = getNodeBuffer(gfbb,
>> &buildstate.giststate, GIST_ROOT_BLKNO,
>>
>> InvalidOffsetNumber, NULL, false);
>> if (!nodeBuffer || nodeBuffer->blocksCount <= 0)
>> break;
>> MemoryContextSwitchTo(gfbb->**context);
>> gfbb->bufferEmptyingStack = lcons(nodeBuffer,
>> gfbb->bufferEmptyingStack);
>> MemoryContextSwitchTo(**buildstate.tmpCtx);
>> processEmptyingStack(&**buildstate.giststate,
>> &insertstate);
>> }
>>
>
> What's the purpose of that? Wouldn't the loop through buffersOnLevels lists
> take care of the root node too?
>
I was worried about node buffer deletion from list while scanning that
list gistbuild(). That's why I avoided deletion from lists.
Now I've added additional check for root node in loop over list.

> The calculations in initBuffering() desperately need comments. As does the
> rest of the code too, but the heuristics in that function are particularly
> hard to understand without some explanation.

Some comments were added. I'm working on more of them.

------
With best regards,
Alexander Korotkov.

Attachment	Content-Type	Size
gist_fast_build-0.13.0.patch.gz	application/x-gzip	22.8 KB

In response to

Re: WIP: Fast GiST index build at 2011-08-10 19:45:18 from Heikki Linnakangas

Responses

Re: WIP: Fast GiST index build at 2011-08-12 15:04:55 from Robert Haas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Heikki Linnakangas	2011-08-11 10:28:58	Re: WIP: Fast GiST index build
Previous Message	Andres Freund	2011-08-11 10:03:52	Re: "pgstat wait timeout" warnings