Re: B-Tree index builds, CLUSTER, and sortsupport

From:	Peter Geoghegan <pg(at)heroku(dot)com>
To:	Andreas Karlsson <andreas(at)proxel(dot)se>
Cc:	Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: B-Tree index builds, CLUSTER, and sortsupport
Date:	2014-11-06 01:30:43
Message-ID:	CAM3SWZReiNJ1P-GXSLzqnLvxmaoCw48G8wBHc84qWaOe_-_KCQ@mail.gmail.com
Lists:	pgsql-hackers

Thanks for the review.

On Wed, Nov 5, 2014 at 4:33 PM, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
> I looked at the changes to the code. The new code is clean and there is more
> code re-use and improved readability. One possible further improvement would
> be to move the preparation of SortSupport to a common function since this is
> done three times in the code.

The idea there is to have more direct control of sortsupport. With the
abbreviated keys patch, abbreviation occurs based on a decision made
by tuplesort.c. I can see why you'd say that, but I prefer to keep
initialization of sortsupport structs largely concentrated in
tuplesort.c, and more or less uniform regardless of the tuple-type
being sorted.

> I did some simple benchmarks by adding indexes to temporary tables and could
> see improvements of around 10% in index build time. So it gives a nice, but
> not amazing, performance improvement.

Cool.

> Is there any case where we should expect any greater performance
> improvement?

The really compelling case is abbreviated keys - as you probably know,
there is a further patch that builds on both this patch and the
abbreviated keys patch, so that B-Tree builds and CLUSTER can use
abbreviated keys too. That doesn't really have much to do with this
patch, though. The
important point is that heap tuple sorting (with a query that has no
client overhead, and involves one big sort) and B-Tree index creation
both have their tuplesort as a totally dominant cost. The improvements
for each case should be quite comparable, which is good (except, as
noted in my opening e-mail, when heap/datum tuplesorting can use the
onlyKey optimization, while B-Tree/CLUSTER tuplesorting cannot).
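
To make the abbreviated keys idea concrete, here is a minimal standalone
sketch of the general technique. It is not code from either patch and does
not use tuplesort.c or the SortSupport API; the names SortItem, abbreviate,
compare_items and sort_strings are hypothetical, and plain strcmp() byte
ordering is assumed rather than a locale-aware collation. Each item carries
a fixed-size prefix of its key packed into an integer, so most comparisons
are a cheap integer test, and the full key is only consulted on a tie.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical item: the authoritative key plus its abbreviation. */
typedef struct
{
    const char *full;    /* full, authoritative sort key */
    uint64_t    abbrev;  /* first 8 bytes of the key, packed big-endian */
} SortItem;

/*
 * Pack up to the first 8 bytes of s into an integer, most significant
 * byte first, zero-padding short strings. Unsigned integer order on the
 * result then matches strcmp() order on those first 8 bytes.
 */
static uint64_t
abbreviate(const char *s)
{
    uint64_t v = 0;
    size_t   i;

    assert(s != NULL);
    for (i = 0; i < 8 && s[i] != '\0'; i++)
        v |= (uint64_t) (unsigned char) s[i] << (56 - 8 * i);
    return v;
}

/* Compare abbreviations first; fall back to the full key only on a tie. */
static int
compare_items(const void *pa, const void *pb)
{
    const SortItem *a = pa;
    const SortItem *b = pb;

    if (a->abbrev != b->abbrev)
        return (a->abbrev < b->abbrev) ? -1 : 1;
    return strcmp(a->full, b->full);
}

/* Fill items[] from keys[] and sort them in strcmp() order. */
static void
sort_strings(SortItem *items, const char **keys, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
    {
        items[i].full = keys[i];
        items[i].abbrev = abbreviate(keys[i]);
    }
    qsort(items, n, sizeof(SortItem), compare_items);
}
```

Keys that share their first 8 bytes (say "postgres" and "postgresql") get
equal abbreviations and are resolved by the strcmp() fallback, so the
result is the same as sorting on the full keys alone.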

--
Peter Geoghegan

In response to

Re: B-Tree index builds, CLUSTER, and sortsupport at 2014-11-06 00:33:29 from Andreas Karlsson

Responses

Re: B-Tree index builds, CLUSTER, and sortsupport at 2014-11-06 10:51:45 from Andreas Karlsson
Re: B-Tree index builds, CLUSTER, and sortsupport at 2014-11-07 20:52:04 from Robert Haas
