From: | Peter Geoghegan <pg(at)heroku(dot)com> |
---|---|
To: | Andreas Karlsson <andreas(at)proxel(dot)se> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Subject: | Re: B-Tree index builds, CLUSTER, and sortsupport |
Date: | 2014-11-06 01:30:43 |
Message-ID: | CAM3SWZReiNJ1P-GXSLzqnLvxmaoCw48G8wBHc84qWaOe_-_KCQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Thanks for the review.
On Wed, Nov 5, 2014 at 4:33 PM, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
> I looked at the changes to the code. The new code is clean and there is more
> code re-use and improved readability. On possible further improvement would
> be to move the preparation of SortSupport to a common function since this is
> done three time in the code.
The idea there is to have more direct control of sortsupport. With the
abbreviated keys patch, abbreviation occurs based on a decision made
by tuplesort.c. I can see why you'd say that, but I prefer to keep
initialization of sortsupport structs largely concentrated in
tuplesort.c, and more or less uniform regardless of the tuple-type
being sorted.
> I did some simple benchmarks by adding indexes to temporary tables and could
> see improvements of around 10% in index build time. So it gives a nice, but
> not amazing, performance improvement.
Cool.
> Is there any case where we should expect any greater performance
> improvement?
The really compelling case is abbreviated keys - as you probably know,
there is a patch that builds on this patch, and the abbreviated keys
patch, so that B-Tree builds and CLUSTER can use abbreviated keys too.
That doesn't really have much to do with this patch, though. The
important point is that heap tuple sorting (with a query that has no
client overhead, and involves one big sort) and B-Tree index creation
both have their tuplesort as a totally dominant cost. The improvements
for each case should be quite comparable, which is good (except, as
noted in my opening e-mail, when heap/datum tuplesorting can use the
onlyKey optimization, while B-Tree/CLUSTER tuplesorting cannot).
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2014-11-06 01:38:36 | Re: ltree::text not immutable? |
Previous Message | Jim Nasby | 2014-11-06 01:06:15 | Re: Repeatable read and serializable transactions see data committed after tx start |