Fwd: Multiple disks: RAID 5 or PG Cluster

From: Yves Vindevogel <yves(dot)vindevogel(at)implements(dot)be>
To: pgsql-performance(at)postgresql(dot)org
Subject: Fwd: Multiple disks: RAID 5 or PG Cluster
Date: 2005-06-17 21:31:00
Message-ID: aefbbdac96314b629fa02249d7ed246b@implements.be
Lists: pgsql-performance

BTW, thanks for the opinion ...
I forgot to cc the list ...

Begin forwarded message:

> From: Yves Vindevogel <yves(dot)vindevogel(at)implements(dot)be>
> Date: Fri 17 Jun 2005 23:29:32 CEST
> To: mudfoot(at)rawbw(dot)com
> Subject: Re: [PERFORM] Multiple disks: RAID 5 or PG Cluster
>
> Ok, striping is a good option ...
>
> I'll tell you why I don't care about data loss:
>
> 1) The database will run 6 months, no more.
> 2) The database is fed with upload files. So, if I have a backup each
> day, plus my files of that day, I can restore pretty quickly.
> 3) Power failure is out of the question: battery backup (UPS). Disk
> failure is a minimal chance: new server, new discs, 6 months ...
>
> We do have about 500,000 new records each day in that database, so
> that's why I want performance.
> Records are uploaded into one major table and then denormalised into
> several others.
>
> But I would like to hear from somebody about the clustering method.
> Isn't it used much?
> Or isn't it used on a single machine?
>
> On 17 Jun 2005, at 22:38, mudfoot(at)rawbw(dot)com wrote:
>
>> If you truly do not care about data protection -- either from drive
>> loss or from sudden power failure, or anything else -- and just want
>> to get the fastest possible performance, then do RAID 0 (striping).
>> It may be faster to do that with software RAID on the host than with
>> a special RAID controller. And turn off fsyncing the write ahead log
>> in postgresql.conf (fsync = false).
>>
>> But be prepared to replace your whole database from scratch (or
>> backup or whatever) if you lose a single hard drive. And if you have
>> a sudden power loss or other type of unclean system shutdown (kernel
>> panic or something) then your data integrity will be at risk as well.
>>
>> To squeeze even a little bit more performance, put your operating
>> system, swap and PostgreSQL binaries on a cheap IDE or SATA drive --
>> and only your data on the 5 striped SCSI drives.
>>
>> I do not know what clustering would do for you. But striping will
>> provide a high level of assurance that each of your hard drives will
>> process equivalent amounts of IO operations.
>>
>> Quoting Yves Vindevogel <yves(dot)vindevogel(at)implements(dot)be>:
>>
>>> Hi,
>>>
>>> We are looking to build a new machine for a big PG database.
>>> We were wondering if a machine with 5 scsi-disks would perform better
>>> if we use a hardware raid 5 controller or if we would go for the
>>> clustering in PG.
>>> If we cluster in PG, do we have redundancy on the data like in a
>>> RAID 5?
>>>
>>> Our first concern is performance, not redundancy (we can handle that
>>> a different way, because all data comes from upload files).
>>>
>>> Kind regards,
>>>
>>> Yves Vindevogel
>>> Implements
>>>
> Kind regards,
>
> Yves Vindevogel
> Implements
>
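
For a sense of scale, the daily load mentioned above can be turned into an average insert rate. This is only a rough sketch: the 500,000 records per day comes from the message, while the function name and the idea of a shorter upload window are assumptions for illustration, since the message does not say how the uploads are spread over the day.

```python
# Figure quoted in the message: about 500,000 new records per day.
RECORDS_PER_DAY = 500_000


def average_rows_per_second(records_per_day, load_window_hours=24.0):
    """Average insert rate if the records arrive evenly within the window.

    load_window_hours is a hypothetical parameter: the message does not
    say whether uploads are spread over the day or arrive in bursts.
    """
    return records_per_day / (load_window_hours * 3600)


if __name__ == "__main__":
    print(f"{average_rows_per_second(RECORDS_PER_DAY):.1f} rows/s over 24 hours")
    print(f"{average_rows_per_second(RECORDS_PER_DAY, 1):.1f} rows/s in a 1-hour window")
```

Spread evenly over 24 hours this is under 6 rows per second on average; if the same files arrived within a single hour it would be closer to 139 rows per second, so the peak rate depends on the upload pattern rather than on the daily total.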

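The remark above that striping makes each drive process an equivalent amount of IO can be pictured with a small sketch of how RAID 0 usually lays out data. The 5 drives come from the thread; the chunk size of 16 blocks and the function names are assumptions chosen for illustration, not settings of any particular controller or software RAID implementation.

```python
from collections import Counter


def raid0_locate(logical_block, n_disks=5, blocks_per_chunk=16):
    """Map a logical block number to (disk index, block offset on that disk).

    Consecutive chunks are dealt out to the disks in round-robin order,
    which is the usual RAID 0 layout. blocks_per_chunk is an assumed value.
    """
    chunk = logical_block // blocks_per_chunk
    disk = chunk % n_disks
    offset = (chunk // n_disks) * blocks_per_chunk + logical_block % blocks_per_chunk
    return disk, offset


def blocks_per_disk(n_blocks, n_disks=5, blocks_per_chunk=16):
    """Count how many of the first n_blocks logical blocks land on each disk."""
    return Counter(
        raid0_locate(block, n_disks, blocks_per_chunk)[0] for block in range(n_blocks)
    )


if __name__ == "__main__":
    # A sequential run of 8000 blocks ends up evenly spread over the 5 disks.
    for disk, count in sorted(blocks_per_disk(8000).items()):
        print(f"disk {disk}: {count} blocks")
```

The same layout also shows why losing one drive loses the whole array: every fifth chunk of any large table lives on that drive, and there is no parity to rebuild it from.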
