From: | Peter Geoghegan <pg(at)heroku(dot)com> |
---|---|
To: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Subject: | Re: Using quicksort for every external sort run |
Date: | 2015-12-07 00:45:02 |
Message-ID: | CAM3SWZSFEUEB-qMQrv95SxaT6cine7avBcJCj3g2tega_Y9yTQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Nov 24, 2015 at 4:46 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> Parallel sort is very important. Robert, Amit and I had a call about
>> this earlier today. We're all in agreement that this should be
>> extended in that direction, and have a rough idea about how it ought
>> to fit together with the parallelism primitives. Parallel sort in 9.6
>> could certainly happen -- that's what I'm aiming for. I haven't really
>> done preliminary research yet; I'll know more in a little while.
>
> Glad to hear it, I was hoping to see that.
As I mentioned just now, I'm working on parallel CREATE INDEX
currently, which seems like a good proving ground for parallel sort,
as its where the majority of really expensive sorts occur. It would be
nice to get parallel-aware sort nodes in 9.6, but I don't think I'll
be able to make that happen in time. The required work in the
optimizer is just too complicated.
The basic idea is that we use the parallel heapam interface, and have
backends sort and write runs as with an external sort (if those runs
are would-be internal sorts, we still write them to tape in the manner
of external sorts). When done, worker processes release memory, but
not tapes, initially. The leader reassembles an in-memory
representation of the tapes that is basically consistent with it
having generated those runs itself (using my new approach to external
sorting). Then, it performs an on-the-fly merge, as before.
At the moment, I have the sorting of runs within workers using the
parallel heapam interface more or less working, with workers dumping
out the runs to tape. I'll work on reassembling the state of the tapes
within the leader in the coming week. It's all still rather rough, but
I think I'll have benchmarks before people start taking time off later
in the month, and possibly even code. Cutting the scope of parallel
sort in 9.6 to only cover parallel CREATE INDEX will make it likely
that I'll be able to deliver something acceptable for that release.
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Jim Nasby | 2015-12-07 01:30:56 | Re: Double linking MemoryContext children |
Previous Message | Peter Geoghegan | 2015-12-07 00:25:23 | Re: Using quicksort for every external sort run |