Re: Reliability with RAID 10 SSD and Streaming Replication

From: Cuong Hoang <climbingrose(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Reliability with RAID 10 SSD and Streaming Replication
Date: 2013-05-16 23:52:00
Message-ID: CAAE-9WFa+3qv3V9zE0kyWuhdbPjNEJPW8F_4xE=P6Ft8ZqfiWA@mail.gmail.com
Lists: pgsql-performance

Thank you for your advice, guys. We'll definitely turn off the init.d script
for PostgreSQL on the master. The standby host will be disk-based, so it will
be less vulnerable to power loss.

I forgot to mention that we'll set up WAL-E <https://github.com/wal-e/wal-e> to
ship base backups and WALs to Amazon S3 continuously as another safety
measure. Again, the loss of a few WALs would not be a big issue for us.

Do you think that this setup will be acceptable for our purposes?

Thanks,
Cuong

On Fri, May 17, 2013 at 8:39 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:

> On Thu, May 16, 2013 at 11:46 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>
>> On Thu, May 16, 2013 at 1:34 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>> > On Thu, May 16, 2013 at 7:46 AM, Cuong Hoang <climbingrose(at)gmail(dot)com> wrote:
>> >>
>> >> Hi all,
>> >>
>> >> Our application is write-heavy, and IO utilisation has been a problem
>> >> for us for a while. We've decided to use RAID 10 of 4x500GB Samsung 840
>> >> Pro for the master server. I'm aware of the write cache issue on SSDs in
>> >> case of power loss. However, our hosting provider doesn't offer any
>> >> other choice of SSD drives with a supercapacitor. To minimise risk, we
>> >> will also set up another RAID 10 SAS server in streaming replication
>> >> mode. For our application, a few seconds of data loss is acceptable.
>> >>
>> >> My question is, would corrupted data files on the primary server affect
>> >> the streaming standby? In other words, is this setup acceptable in
>> >> terms of minimising the deficiencies of SSDs?
>> >
>> >
>> >
>> > That seems rather scary to me for two reasons.
>> >
>> > If the data center has a sudden power failure, why would it not take out
>> > both machines either simultaneously or in short succession? Can you
>> > verify that the hosting provider does not have them on the same UPS (or
>> > even worse, as two virtual machines on the same physical host)?
>>
>> I took his standby's "RAID 10 SAS" to mean a disk-drive-based
>> standby.
>
>
> I had not considered that. If the master can't keep up with IO using
> disk drives, wouldn't a replica using them probably fall infinitely far
> behind trying to keep up with the workload?
>
> Maybe the best choice would be to stick with the current set-up (one
> server, spinning rust) and just turn off synchronous_commit, since he is
> already willing to take the loss of a few seconds of transactions.
>
> Cheers,
>
> Jeff
>
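
Jeff's two points above can be put into rough numbers. The following is only
a back-of-the-envelope sketch in Python, not a measurement of any real system:
the function names and the sample rates are made up for illustration. The
first function assumes constant WAL production and replay rates; the second
uses the bound given in the PostgreSQL documentation for
synchronous_commit = off (at most three times wal_writer_delay, whose
default is 200 ms).

```python
def standby_lag_mb(wal_rate_mb_s, replay_rate_mb_s, seconds):
    """Backlog of unreplayed WAL (MB) on the standby after `seconds` of load.

    Hypothetical model: both rates are assumed constant over the interval.
    """
    # The backlog only grows while the master produces WAL faster than the
    # standby can replay it; otherwise the standby stays caught up.
    return max(0, (wal_rate_mb_s - replay_rate_mb_s) * seconds)


def async_commit_loss_window_ms(wal_writer_delay_ms=200):
    """Upper bound (ms) on commits at risk with synchronous_commit = off."""
    # The PostgreSQL documentation states the maximum delay is three times
    # wal_writer_delay; 200 ms is the documented default for that setting.
    return 3 * wal_writer_delay_ms


if __name__ == "__main__":
    # Made-up figures: 40 MB/s of WAL produced, 25 MB/s replayed, one hour.
    print(standby_lag_mb(40, 25, 3600))
    print(async_commit_loss_window_ms())
```

With any sustained gap between the two rates, the backlog grows linearly and
never shrinks, which is the "infinitely far behind" case. The loss window
with asynchronous commit stays well under the few seconds described as
acceptable.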
