From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com> |
Subject: | Re: Parallel tuplesort (for parallel B-Tree index creation) |
Date: | 2016-09-06 07:08:52 |
Message-ID: | 286647c2-761a-0e8b-7db9-ea508c2316ba@iki.fi |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 08/16/2016 03:33 AM, Peter Geoghegan wrote:
> I attach a patch that changes how we maintain the heap invariant
> during tuplesort merging. I already mentioned this over on the
> "Parallel tuplesort, partitioning, merging, and the future" thread. As
> noted already on that thread, this patch makes merging clustered
> numeric input about 2.1x faster overall in one case, which is
> particularly useful in the context of a serial final/leader merge
> during a parallel CREATE INDEX. Even *random* non-C-collated text
> input is made significantly faster. This work is totally orthogonal to
> parallelism, though; it's just very timely, given our discussion of
> the merge bottleneck on this thread.
Nice!
> The patch makes tuplesort merging shift down and displace the root
> tuple with the tape's next preread tuple, rather than compacting and
> then inserting into the heap anew. This approach to maintaining the
> heap as tuples are returned to caller will always produce fewer
> comparisons overall. The new approach is also simpler. We were already
> shifting down to compact the heap within the misleadingly named [2]
> function tuplesort_heap_siftup() -- why not instead just use the
> caller tuple (the tuple that we currently go on to insert) when
> initially shifting down (not the heap's preexisting last tuple, which
> is guaranteed to go straight to the leaf level again)? That way, we
> don't need to enlarge the heap at all through insertion, shifting up,
> etc. We're done, and are *guaranteed* to have performed less work
> (fewer comparisons and swaps) than with the existing approach (this is
> the reason for my optimism about getting this stuff out of the way
> early).
Makes sense.
> This new approach is more or less the *conventional* way to maintain
> the heap invariant when returning elements from a heap during k-way
> merging. Our existing approach is convoluted; merging was presumably
> only coded that way because the generic functions
> tuplesort_heap_siftup() and tuplesort_heap_insert() happened to be
> available. Perhaps the problem was masked by unrelated bottlenecks
> that existed at the time, too.
Yeah, this seems like a very obvious optimization. Is there a standard
name for this technique in the literature? I'm OK with "displace", or
perhaps just "replace" or "siftup+insert", but if there's a standard
name for this, let's use that.
- Heikki
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2016-09-06 07:11:52 | Re: pgsql: Add putenv support for msvcrt from Visual Studio 2013 |
Previous Message | Abhijit Menon-Sen | 2016-09-06 07:06:55 | Re: Proposal for changes to recovery.conf API |