Re: Performances issues with SSD volume ?

From: Thomas SIMON <tsimon(at)neteven(dot)com>
To: grb(at)skogoglandskap(dot)no
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Performances issues with SSD volume ?
Date: 2015-05-22 15:13:13
Message-ID: 555F4789.6060106@neteven.com
Lists: pgsql-admin

Hi Graeme, thanks for your two complete replies.

On 19/05/2015 16:26, Graeme B. Bell wrote:
>
>> After the change, I had the following behavior, and I don't understand
>> why: everything seems to work fine (load is ~6/7, when it was
>> previously ~25/30 on the HDD server), so the SSD server is faster than
>> the HDD one, and my apps run faster too, but after some time (can be 5
>> minutes or 2 hours), the load average increases suddenly (can reach 150
>> !) and does not decrease, so postgres and my application are almost
>> unusable. (even small requests are in statement timeout)
> =====
>
> A braindump of ideas that might be worth investigating
>
>
> 1. Cheaper SSDs tend to have high burst performance and poorer sustained write performance.
> Your SSD may actually have a slower underlying performance that is having trouble keeping up.
> Check online for reviews of your drive's sustained performance. It shouldn't be lower than an HDD, of course...
>
> 2. Your SSD, as it fills up, has to do more and more work to manage wear-levelling and garbage-collection of cells that are being re-used, and more and more cell wipes and rewrites. Usually there is a reserve of cells that can be used immediately but then the controller has to start doing wipes and performance is crippled on some cheaper drives. It may help your SSD to set your RAID partitions to use only 90% of the capacity of a fresh drive. The extra 10% significantly reduces the complexity for the SSD controller to manage the disk and provides a greater pool of cells that can be used for burst write activity and wear-levelling. Basically, cheap SSDs hate being 100% full.
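The 90% suggestion above can be sketched as follows. This is an illustration, not from the thread; the drive size and device name are hypothetical, adjust them to your hardware.

```shell
# Sketch: reserve ~10% of a fresh SSD by sizing the RAID partition to 90%.
disk_gb=480                              # hypothetical drive size in GB
usable_gb=$(( disk_gb * 90 / 100 ))      # space to actually partition
spare_gb=$(( disk_gb - usable_gb ))      # left unpartitioned for the controller
echo "partition to ${usable_gb} GB, leave ${spare_gb} GB unpartitioned"
# One way to apply it (replace sdX with your device):
#   parted /dev/sdX mkpart primary 0% 90%
```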

As you said in your other reply, the Intel SSDs seem to be good disks.

I've got FastPath on my RAID controller; that's why I set up WT (write-through) on the SSDs:

megacli -ELF -ControllerFeatures -a0

Activated Advanced Software Options
---------------------------
Advanced Software Option : MegaRAID FastPath
Mode : Secured
Time Remaining : Unlimited
...

So it should be OK for WT?

> 4. Remember to read up on readahead, filesystem 4k alignment, and IO scheduler choice.

Scheduler choice is noop now; 4k alignment is OK.
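For reference, these settings can be checked along these lines (Linux; the device name is a hypothetical example, and the alignment rule is that a partition's start offset in bytes must be a multiple of 4096):

```shell
# Check IO scheduler and readahead for a device (replace sda as needed).
dev=sda
[ -r "/sys/block/$dev/queue/scheduler" ] && cat "/sys/block/$dev/queue/scheduler"
command -v blockdev >/dev/null && blockdev --getra "/dev/$dev" 2>/dev/null || true

# 4k alignment: (start_sector * 512) must be divisible by 4096.
start_sector=2048   # example start sector, as reported by fdisk -l
if [ $(( start_sector * 512 % 4096 )) -eq 0 ]; then
  align="aligned"
else
  align="misaligned"
fi
echo "partition start: $align"
```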
>
> 5. Look for any other tasks that are running. For example, we saw local SSDs suffer under one of the schedulers when bulk copying of hundreds of GB of data was occurring. There wasn't enough IO left for the random reads/writes to maintain their performance. The scheduler was giving all the IO to the copying routine. May have been NOOP or CFQ, it wasn't deadline.
My server is dedicated to postgres, no other tasks running.

> 6. If you have a BBU, perhaps the slowness is occurring there. Enable direct mode on your raid card and disable read-ahead caching on the raid controller. Check the battery and maybe even the battery controller too. We had a battery/controller that it turned out reported 'working fine' 99% of the time then randomly decided it wasn't fine, before recovering soon after (it wasn't recharging or self-testing, it was just broken, it seemed, and had weird overheating behaviour). Replacing them fixed the problem.

My current config on the SSD RAID volume is the following:

Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU

I've found no errors like the one you had in your log dump.
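A quick sanity check of the cache policy can be scripted. The MegaCli flags here are an assumption based on common LSI usage, and the fallback string mirrors the policy line quoted in this mail so the check also demonstrates itself without the tool installed:

```shell
# Verify the logical-drive cache policy is WriteThrough + Direct (FastPath-friendly).
policy=$(MegaCli -LDInfo -LAll -aAll 2>/dev/null | grep 'Current Cache Policy') \
  || policy="Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU"
case "$policy" in
  *WriteThrough*Direct*) verdict="OK for FastPath" ;;
  *) verdict="not OK: consider e.g. MegaCli -LDSetProp WT -LAll -aAll" ;;
esac
echo "$verdict"
```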
>
> 7. Firmware update your SSDs to the latest versions. Try this last after everything else. You can take disks out of RAID one at a time to do this, presuming you have a hotspare or spare. Remember to check the array has fully resilvered before you put the reflashed drive back in again. Could be a bad firmware.
>
> 8. As someone said, could be memory rather than disk related. Check your NUMA settings ( I think I keep NUMA off) / interleaving on.
NUMA seems to be enabled. Not sure how to disable it; I've set
vm.zone_reclaim_mode = 0, but it still seems to be enabled:

numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 20 21 22 23 24 25 26 27 28 29
node 0 size: 128966 MB
node 0 free: 899 MB
node 1 cpus: 10 11 12 13 14 15 16 17 18 19 30 31 32 33 34 35 36 37 38 39
node 1 size: 129021 MB
node 1 free: 890 MB
node distances:
node 0 1
0: 10 21
1: 21 10

I now enable interleaving with "numactl --interleave=all
/etc/init.d/postgresql start".

>
> 9. Random thought. Could it be something funny like you have code that is using lots of locks and your apps or DB is hitting deadlock at some level of activity? Check pg_locks to make sure. Would be strange if you had this now but not with the HDDs though. But maybe the SSD is letting more things start running together, and the lock problem is emerging at some level of activity.
That is indeed a possibility. I will check next time I do the switch.
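For that check, one way is to keep a small query on hand that lists only sessions stuck waiting on a lock (the pg_locks columns used here are standard; local psql access is an assumption):

```shell
# Save a lock-check query; run it with psql when the load spike happens.
cat > /tmp/check_locks.sql <<'EOF'
-- Ungranted locks mean a session is blocked waiting on another one.
SELECT pid, locktype, mode, relation::regclass AS rel
FROM pg_locks
WHERE NOT granted;
EOF
# Run against the server:  psql -f /tmp/check_locks.sql
```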
>
> 10. Check your VM settings to make sure that you're not in a situation where lots of data is waiting to be cleared all at once. e.g. dirty_background_bytes, dirty_bytes, and then stalling all other IO.
Here are my parameters for this.
They seem to be low values, but I don't know whether they are good or how
to tune them.

vm.dirty_background_bytes = 8388608
vm.dirty_background_ratio = 0
vm.dirty_bytes = 67108864
vm.dirty_ratio = 0

>
> 11. Check your crontabs and cron.d to make sure nothing else is running which might lock tables or nuke performance. Backup routines, manual vacuums/statistics, etc.
I've nothing here.
>
> 12. Make sure you don't have wal_buffers set at some crazy high level so that a transaction commit is causing insane amounts of data to get cleared out of cache synchronously.
wal_buffers is now set to -1; in 9.3+ versions the setting is automatic.
> 13. Just out of curiosity are you using 2-stage/synchronous commit to the slave? Maybe the problem is on the slave.
No, I'm using hot standby in asynchronous mode.
>
> 14. Have you tested what happens if you go back to the HDDs, e.g. does the problem persist or go away? Maybe it's coincidence it arrived with the SSDs.
The problem, at this scale, no longer appears when I go back to the HDD
server. The behavior there was lower performance, but stable over time.

>
> Graeme Bell
>
