On Sun, Jun 10, 2007 at 08:49:24PM +0100, Heikki Linnakangas wrote:
> Jim C. Nasby wrote:
> >On Thu, Jun 07, 2007 at 10:16:25AM -0400, Tom Lane wrote:
> >>Heikki Linnakangas <heikki(at)enterprisedb(dot)com> writes:
> >>>Thinking about this whole idea a bit more, it occurred to me that the
> >>>current approach to write all, then fsync all is really a historical
> >>>artifact of the fact that we used to use the system-wide sync call
> >>>instead of fsyncs to flush the pages to disk. That might not be the best
> >>>way to do things in the new load-distributed-checkpoint world.
> >>>How about interleaving the writes with the fsyncs?
> >>I don't think it's a historical artifact at all: it's a valid reflection
> >>of the fact that we don't know enough about disk layout to do low-level
> >>I/O scheduling. Issuing more fsyncs than necessary will do little
> >>except guarantee a less-than-optimal scheduling of the writes.
> >
> >If we extended relations by more than 8k at a time, we would know a lot
> >more about disk layout, at least on filesystems with a decent amount of
> >free space.
>
> I doubt it makes that much difference. If there was a significant amount
> of fragmentation, we'd hear more complaints about seq scan performance.
>
> The issue here is that we don't know which relations are on which drives
> and controllers, how they're striped, mirrored, etc.

Actually, isn't pre-allocation one of the tricks that Greenplum uses to
get its seqscan performance?
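
To make the pre-allocation idea concrete, here is a minimal sketch of
extending a relation file in multi-block chunks rather than 8k at a time.
It is an illustration only, not how PostgreSQL or Greenplum actually do it:
the function name extend_relation and the 128-block chunk size are
hypothetical, and it assumes a platform where os.posix_fallocate is
available (it falls back to ftruncate, which may leave a sparse file).

```python
import os
import tempfile

BLOCK_SIZE = 8192  # the 8k extension unit discussed in the thread


def extend_relation(fd, blocks_needed, chunk_blocks=128):
    """Grow the file so it holds at least blocks_needed blocks.

    The new size is rounded up to a whole number of chunks, so the
    filesystem is asked for one larger allocation instead of many 8k ones.
    chunk_blocks=128 (1 MB) is an arbitrary illustrative value.
    Returns the resulting file size in bytes.
    """
    current = os.fstat(fd).st_size
    needed = blocks_needed * BLOCK_SIZE
    if needed <= current:
        return current  # already large enough, nothing to do

    chunk = chunk_blocks * BLOCK_SIZE
    target = -(-needed // chunk) * chunk  # round up to a chunk boundary

    try:
        # Ask the filesystem to actually reserve the space.
        os.posix_fallocate(fd, current, target - current)
    except (AttributeError, OSError):
        # Not available or not supported here: only set the length.
        os.ftruncate(fd, target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        fd = os.open(os.path.join(d, "rel"), os.O_RDWR | os.O_CREAT, 0o600)
        try:
            print(extend_relation(fd, 1))
        finally:
            os.close(fd)
```

Whether the reserved space ends up contiguous on disk is still up to the
filesystem, which is why the caveat about a decent amount of free space
applies.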
--
Jim Nasby decibel(at)decibel(dot)org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)