From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <stark(at)enterprisedb(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Don Marvick <donmarvick(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: polyphase merge?
Date: 2009-02-04 15:17:55
Message-ID: 2245.1233760675@sss.pgh.pa.us
Lists: pgsql-hackers
Greg Stark <stark(at)enterprisedb(dot)com> writes:
> Is this basically the same as our current algorithm but without
> multiplexing the tapes onto single files? I have been wondering
> whether we multiplex the tapes any better than filesystems can lay out
> separate files actually.
The reason for the multiplexing is so that space can get re-used
quickly. If each tape were represented as a separate file, there would
be no way to release blocks as they're read; you could only give back
the whole file after reaching end of tape. Which would at least double
the amount of disk space needed to sort X amount of data. (It's
actually even worse, more like 4X, though the multiplier might depend on
the number of "tapes" --- I don't recall the details anymore.)
The penalty we pay is that in the later merge passes, the blocks
representing a single tape aren't very well ordered.
It might be interesting to think about some compromise that wastes a
little more space in order to get better sequentiality of disk access.
It'd be easy to do if we were willing to accept a 2X space penalty,
but I'm not sure if that would fly or not. It definitely *wasn't*
acceptable to the community a few years ago when the current code was
written. Disks have gotten bigger since then, but so have the problems
people want to solve.
regards, tom lane