Re: Block at a time ...

From: Craig James <craig_james(at)emolecules(dot)com>
To: Scott Carey <scott(at)richrelevance(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Block at a time ...
Date: 2010-03-22 23:46:13
Message-ID: 4BA80145.5010806@emolecules.com
Lists: pgsql-performance

On 3/22/10 11:47 AM, Scott Carey wrote:
>
> On Mar 17, 2010, at 9:41 AM, Craig James wrote:
>
>> On 3/17/10 2:52 AM, Greg Stark wrote:
>>> On Wed, Mar 17, 2010 at 7:32 AM, Pierre C<lists(at)peufeu(dot)com> wrote:
>>>>> I was thinking in something like that, except that the factor I'd use
>>>>> would be something like 50% or 100% of current size, capped at (say) 1 GB.
>>>
>>> This turns out to be a bad idea. One of the first things Oracle DBAs
>>> are told to do is change this default setting to allocate some
>>> reasonably large fixed size rather than scaling upwards.
>>>
>>> This might be mostly due to Oracle's extent-based space management but
>>> I'm not so sure. Recall that the filesystem is probably doing some
>>> rounding itself. If you allocate 120kB it's probably allocating 128kB
>>> itself anyways. Having two layers rounding up will result in odd
>>> behaviour.
>>>
>>> In any case I was planning on doing this a while back. Then I ran some
>>> experiments and couldn't actually demonstrate any problem. ext2 seems
>>> to do a perfectly reasonable job of avoiding this problem. All the
>>> files were mostly large contiguous blocks after running some tests --
>>> IIRC running pgbench.
>>
>> This is one of the more-or-less solved problems in Unix/Linux. Ext* file systems have a "reserve", 5% of the disk space by default, that nobody except root can use. It's not just for root: with some of the disk kept free, you can almost always do a decent job of allocating contiguous blocks and get good performance. Unless Postgres has some weird problem that Linux has never seen before (and that wouldn't be unprecedented...), there's probably no need to fool with file-allocation strategies.
>>
>> Craig
>>
>
> It's fairly easy to break. Just do a parallel import with, say, 16 concurrent tables being written to at once. Result? Fragmented tables.

Is this from real-life experience? With fragmentation, there's a point of diminishing returns. A couple of head seeks now and then hardly matter. My recollection is that even when lots of concurrent processes are all making files larger and larger, the Linux file system can still do a pretty good job of allocating mostly contiguous space. It doesn't just dumbly allocate from some list, but rather tries to allocate in a way that results in pretty good "contiguousness" (if that's a word).

On the other hand, this is just from reading discussion groups like this one over the last few decades. I haven't tried it...

Craig
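
For anyone who wants to try it, here is a rough sketch of how the "16 tables written at once" case might be approximated and measured. It is not from the thread and makes several assumptions: a single process appending round-robin stands in for truly concurrent backends, the file names and the helper names interleaved_write and count_extents are made up for illustration, the 8 kB append unit mirrors the default PostgreSQL page size, and extent counts come from the filefrag utility (e2fsprogs), which may not be installed or may not work on every file system.

```python
import os
import re
import shutil
import subprocess
import tempfile

BLOCK = 8192  # assumed append unit: the default PostgreSQL page size


def interleaved_write(directory, n_files=16, blocks_per_file=256,
                      sync_each_block=False):
    """Append BLOCK-sized chunks to n_files files in round-robin order.

    This only approximates a parallel import: one process interleaves
    the appends instead of 16 backends writing concurrently.
    """
    paths = [os.path.join(directory, f"table_{i}.dat") for i in range(n_files)]
    handles = [open(p, "wb") for p in paths]
    try:
        chunk = b"\0" * BLOCK
        for _ in range(blocks_per_file):
            for fh in handles:
                fh.write(chunk)
                if sync_each_block:
                    # Force the block out so the file system cannot
                    # batch up allocations for this file.
                    fh.flush()
                    os.fsync(fh.fileno())
    finally:
        for fh in handles:
            fh.close()
    return paths


def count_extents(path):
    """Return the extent count reported by filefrag, or None if unavailable."""
    if shutil.which("filefrag") is None:
        return None
    result = subprocess.run(["filefrag", path], capture_output=True, text=True)
    match = re.search(r"(\d+) extents? found", result.stdout)
    if result.returncode != 0 or match is None:
        return None
    return int(match.group(1))


if __name__ == "__main__":
    # Point dir= at the file system under test; the default temporary
    # directory may live on tmpfs, where extent counts mean nothing.
    with tempfile.TemporaryDirectory() as d:
        for p in interleaved_write(d):
            print(os.path.basename(p), os.path.getsize(p), count_extents(p))
```

A handful of extents per file would support the "pretty good contiguousness" view; hundreds per file would support the fragmentation complaint. Running it once with sync_each_block=False and once with True should show how much the result depends on the file system being able to delay allocation.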
