Re: RES: Initial database loading and IDE x SCSI

From: Mark Lewis <mark(dot)lewis(at)mir3(dot)com>
To: carlosreimer(at)terra(dot)com(dot)br
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-performance(at)postgresql(dot)org
Subject: Re: RES: Initial database loading and IDE x SCSI
Date: 2006-06-02 20:13:29
Message-ID: 1149279209.20798.108.camel@archimedes
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Fri, 2006-06-02 at 16:54 -0300, carlosreimer(at)terra(dot)com(dot)br wrote:
> > <carlosreimer(at)terra(dot)com(dot)br> writes:
> > > I would like to know if my supposition is right.
> >
> > > Considering an environment with only one hard disk attached to
> > a server, an
> > > initial loading of the database probably is much faster using an IDE/ATA
> > > interface with write-back on than using an SCSI interface.
> > That´s because of
> > > the SCSI command interface overhead.
> >
> > I *seriously* doubt that.
> >
> > If you see a difference in practice it's likely got more to do with the
> > SCSI drive not lying about write-complete ...
> >
>
> Many thanks for the answers! There are some more thinks I could not
> understand about this issue?
>
> I was considering it but if you have a lot of writes operations, will not
> the disk cache full quickly?
>
> If it´s full will not the system wait until something could be write to the
> disk surface?
>
> If you have almost all the time the cache full will it not useless?
>
> Should not, in this scenary, with almost all the time the cache full, IDE
> and SCSI write operations have almost the same performance?
>

This is the ideal case. However, you only get to that case if you use
large transactions or run with fsync=off or run with a write cache (like
IDE drives, or nice RAID controllers which have a battery-backed cache).

Remember that one of the important qualities of a transaction is that
it's durable, so once you commit it the data is definitely stored on the
disk and one nanosecond later you could power the machine off and it
would still be there.

To achieve that durability guarantee, the system needs to make sure that
if you commit a transaction, the data is actually written to the
physical platters on the hard drive.

This means that if you take the naive approach to importing data (one
row at a time, each in its own transaction), then instead of blasting
data onto the hard drive at maximum speed, the application will wait for
the platter to rotate to the right position, write one row's worth of
data, then wait for the platter to rotate to the right position again
and insert another row, etc. This approach is very slow.

The naive approach works on IDE drives because they don't (usually)
honor the request to write the data immediately, so it can fill its
write cache up with several megabytes of data and write it out to the
disk at its leisure.

> Thanks in advance,
>
> Reimer
>

-- Mark

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message carlosreimer 2006-06-02 20:37:01 RES: RES: Initial database loading and IDE x SCSI
Previous Message carlosreimer 2006-06-02 19:54:50 RES: Initial database loading and IDE x SCSI