From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | Peter Geoghegan <pg(at)bowt(dot)ie>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)heroku(dot)com> |
Subject: | Re: Tuplesort merge pre-reading |
Date: | 2017-04-18 08:05:40 |
Message-ID: | 5db9d579-efdb-857e-82a8-7763313e7931@iki.fi |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 04/14/2017 08:19 AM, Peter Geoghegan wrote:
> BTW, I'm skeptical of the idea of Heikki's around killing polyphase
> merge itself at this point. I think that keeping most tapes active per
> pass is useful now that our memory accounting involves handing over an
> even share to each maybe-active tape for every merge pass, something
> established by Heikki's work on external sorting.
The pre-read buffers are only needed for input tapes; the total number
of tapes doesn't matter.
For comparison, imagine that you want to perform a merge, such that you
always merge 7 runs into one. With polyphase merge, you would need 8
tapes, so that you always read from 7 of them, and write onto one. With
balanced merge, you would need 14 tapes: you always read from 7 tapes,
and you would need up to 7 output tapes, of which one would be active at
any given time.
Those extra idle output tapes are practically free in our
implementation. The "pre-read buffers" are only needed for input tapes,
the number of output tapes doesn't matter. Likewise, maintaining the
heap is cheaper if you only merge a small number of tapes at a time, but
that's also dependent on the number of *input* tapes, not the total
number of tapes.
- Heikki
From | Date | Subject | |
---|---|---|---|
Next Message | Jan Michálek | 2017-04-18 08:13:57 | Re: Other formats in pset like markdown, rst, mediawiki |
Previous Message | Kang Yuzhe | 2017-04-18 07:54:23 | Re: On How To Shorten the Steep Learning Curve Towards PG Hacking... |