From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Peter Geoghegan <pg(at)heroku(dot)com> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: Parallel tuplesort (for parallel B-Tree index creation) |
Date: | 2016-10-21 10:55:04 |
Message-ID: | CAA4eK1LiAERyWnr90LKQNCu3AToK0iP=j=mV1hNdkhX7WB-69A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Oct 18, 2016 at 3:48 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Mon, Oct 17, 2016 at 5:36 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> I read the following paragraph from the Volcano paper just now:
>
> """
> During implementation and benchmarking of parallel sorting, we added
> two more features to exchange. First, we wanted to implement a merge
> network in which some processors produce sorted streams merge
> concurrently by other processors. Volcano’s sort iterator can be used
> to generate a sorted stream. A merge iterator was easily derived from
> the sort module. It uses a single level merge, instead of the cascaded
> merge of runs used in sort. The input of a merge iterator is an
> exchange. Differently from other operators, the merge iterator
> requires to distinguish the input records by their producer. As an
> example, for a join operation it does not matter where the input
> records were created, and all inputs can be accumulated in a single
> input stream. For a merge operation, it is crucial to distinguish the
> input records by their producer in order to merge multiple sorted
> streams correctly.
> """
>
> I don't really understand this paragraph, but thought I'd ask: why the
> need to "distinguish the input records by their producer in order to
> merge multiple sorted streams correctly"? Isn't that talking about
> partitioning, where each workers *ownership* of a range matters?
>
I think so, but it seems from above text that is mainly required for
merge iterator which probably will be used in merge join.
> My
> patch doesn't care which values belong to which workers. And, it
> focuses quite a lot on dealing well with the memory bandwidth bound,
> I/O bound part of the sort where we write out the index itself, just
> by piggy-backing on tuplesort.c. I don't think that that's useful for
> a general-purpose executor node -- tuple-at-a-time processing when
> fetching from workers would kill performance.
>
Right, but what is written in text quoted by you seems to be do-able
with tuple-at-a-time processing.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2016-10-21 10:57:15 | Re: [RFC] Transaction management overhaul is necessary? |
Previous Message | Amit Kapila | 2016-10-21 10:49:57 | Re: Parallel tuplesort (for parallel B-Tree index creation) |