Linux software RAID with 2-6 disks: ext3/ext4 performance tests
Recently I got my hands on six 500 GB SATA 7200 rpm disks, so I decided to run some performance tests of Linux software RAID.
There are a lot of legends about which RAID levels are better for databases or file storage, and about how much CPU load each RAID layout generates.
So I decided to verify some of them.
The test setup was an Intel C2D E8200 with 4 GB DDR2 RAM and six SATA 7200 rpm disks (Ubuntu, bonnie++ 1.96).
The disks were: 3*Seagate ST3500418AS, 1*Seagate ST3500620AS, 1*Samsung HD502IJ, 1*Samsung HD502HJ.
The biggest drawback of this test configuration is that both Samsung disks are about 30% slower than the fastest Seagate disks here,
and the one older Seagate is about 10% slower than the newer ones. So I decided to start with the three 418AS drives, then add the older Seagate, and use the Samsungs last.
The perfect setup would be six identical disks, but life is never ideal, so I'm happy to be able to test these six :).
Again, the performance of ext3 and ext4 was compared.
Each time I created a new filesystem with the proper extended options for the tested RAID layout, so filesystem geometry wasn't an issue.
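For reference, the "proper extended options" are ext3/ext4's stride and stripe-width. A minimal sketch of how they are derived, assuming a 4 KiB filesystem block size and a 64 KiB md chunk (both assumptions; the post doesn't state the chunk size used):

```python
# Sketch: derive mkfs.ext4 -E stride/stripe-width for a RAID array.
# Assumes a 4 KiB fs block and a 64 KiB md chunk -- adjust for your array.

def raid_fs_options(chunk_kib, block_kib, data_disks):
    """Return (stride, stripe_width) in filesystem blocks."""
    stride = chunk_kib // block_kib        # fs blocks per chunk
    stripe_width = stride * data_disks     # fs blocks per full data stripe
    return stride, stripe_width

# e.g. a 3-disk raid5 has 2 data disks:
stride, width = raid_fs_options(64, 4, 2)
print(f"mkfs.ext4 -E stride={stride},stripe-width={width} /dev/md0")
```

For raid0 the number of data disks equals the disk count; raid5 loses one disk to parity and raid6 two.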
I started with a 2-disk setup. I haven't tested mirror layouts at all, because they perform at best like the slowest disk in the set, or sometimes even worse,
so I'm not interested in using raid1.
Again there are no character I/O tests (-f option), and I'm cutting off the file-creation results and latencies this time.
So the numbers show: sequential write speed (KB/s), sequential write CPU load (%), sequential rewrite speed (KB/s), CPU load (%), sequential read speed (KB/s), CPU load (%), seeks (#), seek CPU load (%).
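The raw lines below are easier to read with labels attached; here is a small helper (my own, not part of bonnie++) that maps a result line to named fields in the order just described:

```python
# Label the comma-separated bonnie++ numbers quoted in this post.
# The field order follows the description above.

FIELDS = ("write_kbs", "write_cpu", "rewrite_kbs", "rewrite_cpu",
          "read_kbs", "read_cpu", "seeks", "seek_cpu")

def parse_result(line):
    """Map '125244,11,60787,...' to a dict of named floats."""
    return dict(zip(FIELDS, (float(x) for x in line.split(","))))

r = parse_result("125244,11,60787,7,153015,5,298.6,3")
print(r["read_kbs"], r["seeks"])
```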
2 disks results:
raid0 ext3: 125244,11,60787,7,153015,5,298.6,3
raid0 ext4: 152883,5,63532,6,188576,6,366.9,3
With those results you can see that ext4 runs the bonnie++ tests faster. Everything is better: about 22% faster writes, 5% faster rewrites, 23% faster reads and 23% faster seeks. CPU load is lower in most cases with ext4. So as far as filesystem choice goes, ext4 is better (at least for 2-disk raid0 :)).
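The percentages come straight from the two result lines above; a quick check (the rounding may land a point away from the figures quoted in the text):

```python
# Re-derive the ext4-vs-ext3 gains from the 2-disk raid0 lines above
# (write/rewrite/read in KB/s, seeks in operations per second).

ext3 = {"write": 125244, "rewrite": 60787, "read": 153015, "seeks": 298.6}
ext4 = {"write": 152883, "rewrite": 63532, "read": 188576, "seeks": 366.9}

for key in ext3:
    gain = (ext4[key] / ext3[key] - 1) * 100
    print(f"{key}: {gain:+.0f}%")
```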
3 disks results:
Again ext4 is faster for both layouts (raid0 and raid5). And no surprise: raid0 is faster than raid5 in all categories, but CPU load is lower with raid5.
Comparing raid0 results for 2 and 3 disks (ext4), the 3-disk setup is unsurprisingly faster: 39% faster writes, 47% faster rewrites, 41% faster reads and 13% faster seeks. CPU load is higher with the 3-disk set than with 2 disks.
There are more possibilities with 4 disks, so here are raid0, raid5, raid6 and raid10.
4 disks results:
raid0 ext3: 253306,36,124717,15,346068,12,466.7,4
raid5 ext3: 103054,9,58472,6,213668,7,349.0,3
raid6 ext3: 85059,6,41265,4,147758,5,324.2,3
raid10 ext3: 127874,9,73172,9,173371,6,434.3,4
raid0 ext4: 287139,14,127041,13,352917,10,520.1,6
raid5 ext4: 186413,7,64775,6,221985,7,409.6,4
raid6 ext4: 130736,3,44638,4,148480,5,372.7,3
raid10 ext4: 144713,7,59525,7,176006,6,489.6,4
The disk added in this layout was slower than the first 3, so that has some impact on the results; I guess the performance gain would be bigger with an identical disk. Comparing ext3 and ext4: for raid0 only writes are noticeably faster and seeks are much better, the rest is almost the same. Raid5 is much faster with ext4. Raid6 writes are a lot faster, the rest is almost the same (a small gain in seeks). A funny thing with raid10: writes are faster (about 15%), but rewrites are 19% slower, reads are comparable and seeks are a bit better with ext4.
Anyway, forget the ext3/ext4 comparison; there is a much more interesting thing going on: raid5/6/10 performance. And for different filesystems we get different winners here :). With ext3, raid10 is the fastest for writes and rewrites; reads are fastest on raid5, with raid10 second. But with ext4, raid5 wins writes, rewrites and reads (raid10 still leads in seeks). Strange but true. I have some hardware RAID results in my archive, so I'll have to compare those too, but I don't remember if I have any ext4 results; if not, I'll run some more tests on hardware RAIDs. At the moment it looks like ext4 on software raid10 is a drawback, and I'm not quite sure why; as I mentioned before, I was using a stride and stripe-width calculator, so that wasn't the issue here.

Raid6 is the slowest in every category, so for performance don't use it. But I've had such bad experiences with failed raid5 arrays that I can't recommend that level either, even though it's the fastest with ext4. You have to decide for yourself; the performance results are there to see.

So let's look at the speed gains from adding another disk (I'm looking at the ext4 gains). For raid0 it's 34% in writes, 36% in rewrites, 32% in reads and 25% in seeks. A really nice gain, considering the added disk is slower than the other ones. And one more thing: I'd say sequential reads at 350 MB/s and writes at 287 MB/s are quite impressive for a modest PC without the fastest disks available at the moment. Of course with raid0 you get no data protection, but for people working with big files these results should be interesting. The raid5 gains are impressive too: 43% in writes, 33% in rewrites, 58% in reads and 30% in seeks. So one more disk and you go about 40% faster on average (and remember this one was a bit slower than the other 3 disks). Adding an extra disk works out better than you might think.
So let's go to 5 disk layouts.
5 disks results:
raid0 ext3: 306561,48,151865,19,405209,13,546.9,5
raid5 ext3: 128620,14,77241,8,278531,9,430.3,4
raid6 ext3: 113310,10,59741,6,207506,6,403.8,4
raid0 ext4: 358141,22,152093,15,445002,14,635.0,5
raid5 ext4: 236163,11,89732,9,281769,9,472.7,3
raid6 ext4: 124686,1,54307,4,165431,5,227.0,1
Almost nothing surprising here. Ext4 is faster than ext3 except for raid6 rewrites, reads and seeks, and I'm not sure why. Looks like I'll have to run some tests on a hardware platform to see whether this happens whenever there are more disks in the RAID set, or whether these are just bad results. I'm sure I copied them correctly, and the I/O of the tested disks was quiesced before every test, so the environment was the same. But the results look strange.
But I'm curious about the speed gain from another disk (again a slower one).
Raid0 is of course faster in every category (for both ext3 and ext4). For ext4 it's 24% for writes, 20% for rewrites, 26% for reads and 22% for seeks. That looks right: with 25% more disks (1/4 more) you get almost that much performance gain.
Raid5 is faster with the additional disk for both ext3 and ext4. For ext4 it's 27% for writes, 38% for rewrites, 27% for reads and 15% for seeks. Nice gains too, and to be honest the speeds are impressive as well. But OK, five disks in a home PC are expensive to buy, and the electricity bill could be shocking ;).
It's hard to say much about the raid6 gains, because some speeds are lower and some are higher. It looks confusing; maybe it has something to do with the RAID layout (I mean the chunk size, or the n/f/o layout for raid10), or maybe it's the slower disk's fault.
And the last test, with 6 disks. Again a slow Samsung was added to the RAID, but here are the results.
6 disks results:
raid0 ext3: 353925,59,179870,22,486911,16,648.4,5
raid5 ext3: 145828,17,92977,11,370512,13,532.2,7
raid6 ext3: 129762,14,77524,9,289665,9,443.6,4
raid10 ext3: 187909,21,93558,12,263770,8,610.8,6
raid0 ext4: 423795,30,182493,21,525318,16,729.8,5
raid5 ext4: 254794,13,110805,12,350398,11,572.8,6
raid6 ext4: 216581,9,86729,9,290029,9,496.3,5
raid10 ext4: 213449,14,80321,9,260611,9,668.3,7
So here we go. Again the ext3/ext4 comparison is confusing, because sometimes ext3 has better results and sometimes ext4. So I guess you have to choose your setup (RAID level, filesystem) depending on what you're going to do with the filesystem (more reads or more writes), or run additional tests. The speed winner is of course raid0: more than 0.5 GB/s reads and 0.4 GB/s writes, so you can easily saturate a 1 Gb connection with such a setup. To be honest, almost every one of these setups can push more than 1 Gbps of data to the network, and it looks like every one of them can sustain 1 Gbps writes. Really nice.
With ext3, raid10 seems the fastest for writes with data protection; for reads raid5 is the champ.
With ext4, writes and reads are best with raid5; raid6 is second, and raid10 is a bit slower than raid6.
For seeks, raid10 is the best with both ext3 and ext4, which could be one reason people choose it. Raid6 is the worst in this category.
And what about the gains?
Raid0 ext4: +18% writes, +20% rewrites, +18% reads and +15% seeks. A gain proportional to the setup change (one more disk).
Raid5 ext4: +8% writes, +23% rewrites, +24% reads and +21% seeks. A low gain in writes, but the rest seems reasonable.
Raid6 ext3: +15% writes, +29% rewrites, +40% reads and +10% seeks. Some gains are better than I would expect, some worse, but the average is about right.
Raid10 ext4 (gain over the 4-disk set, i.e. 2 disks added): +46% writes, +35% rewrites, +47% reads and +36% seeks. A gain proportional to the number of disks added (50%).
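These gain figures can be re-derived from the 5-disk and 6-disk tables; for example the raid5 ext4 row:

```python
# Check the raid5 ext4 gains using the 5-disk and 6-disk result lines
# quoted above (write/rewrite/read in KB/s, seeks in ops/s).

five = (236163, 89732, 281769, 472.7)   # 5-disk raid5 ext4
six = (254794, 110805, 350398, 572.8)   # 6-disk raid5 ext4

for name, a, b in zip(("writes", "rewrites", "reads", "seeks"), five, six):
    print(f"{name}: {(b / a - 1) * 100:+.0f}%")
```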
Choosing the proper RAID level is a hard task. You should really know what you want from your disk subsystem, and what protection level you need. Sometimes you have to pick the RAID level while designing a future environment, with no chance to run tests first.
There is an opinion that raid10 is the best solution for the fastest writes, and that it's the fastest at seeks. The results confirm the seeks, but
writes are not always fastest with raid10. That could of course be a fault of this test environment, since not all the disks were identical.
Some people think that raid10 protects data better than raid5, but in the worst case that is not true. Raid10 is a stripe of mirrors, so it is possible
to lose both disks of one mirror pair, and then the raid10 and all its data are unavailable. The same goes for raid5: 2 disks down and all data is gone.
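The worst case described above can be put into numbers: once one disk has died, what fraction of possible second failures takes the whole array down? A simple idealized sketch (my own illustration, assuming the second failure hits a random surviving disk):

```python
# Fraction of second disk failures that destroy the array, after one
# disk has already failed. Idealized: the second failure is uniform
# over the surviving disks.

def fatal_second_failure(level, disks):
    if level == "raid5":
        return 1.0                 # any second loss kills a degraded raid5
    if level == "raid6":
        return 0.0                 # raid6 survives any two losses
    if level == "raid10":
        return 1 / (disks - 1)     # only the dead disk's mirror partner
    raise ValueError(level)

for level in ("raid5", "raid6", "raid10"):
    p = fatal_second_failure(level, 6)
    print(f"{level}, 6 disks: {p:.0%} of second failures are fatal")
```

So raid10 usually survives a second failure, but not always; raid6 is the only level here that is guaranteed to.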
Raid6 is better, because with 2 disks down the data is still available, though of course without any protection anymore. And as the results show, raid6 is slower
in almost every case here. So better protection is the price you pay in performance.
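Capacity is the other side of the same tradeoff. A small sketch assuming n equal disks of size d (here the six 500 GB disks from this test):

```python
# Usable capacity for each level with n equal disks of d GB each.

def usable_gb(level, n, d):
    return {"raid0": n * d,               # no redundancy
            "raid5": (n - 1) * d,         # one disk's worth of parity
            "raid6": (n - 2) * d,         # two disks' worth of parity
            "raid10": n // 2 * d}[level]  # everything mirrored

for level in ("raid0", "raid5", "raid6", "raid10"):
    print(f"{level}: {usable_gb(level, 6, 500)} GB usable")
```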
I'm not sure why people don't use raid0 in environments where you need the best performance and cost doesn't matter. You then need another way
to protect your data (storage mirroring, clustering, etc.), but the speed gain can be very nice.
Looking at the CPU load for each RAID level, only raid0 seems to have a big impact there. Considering that raid0 is almost never used in production
environments (some photo and video editors use it, but only for temporary data), CPU load shouldn't be an important factor in choosing a RAID level.
At the moment my most common choice is raid6, because data availability is almost always more important than performance.
But for a pure database disk subsystem I would choose raid10, or, with another layer of data protection, even raid0.
I guess that for hardware RAID my choice would be ext4, but at the moment, with CentOS 4.x and CentOS 5.x, ext3 seems more reasonable.
Adding a disk to a RAID seems like a nice way to improve performance, especially when the current disk count is low: the fewer disks in the RAID, the bigger the performance gain from adding a new one. It looks like the gain for a 10-disk RAID is too low to make the extra spending reasonable, at least for software RAID with SATA disks. But the gain seems roughly linear in most cases, so with identical disks adding an eleventh disk to the set should get you something around (or a bit under) 10%. And remember that sometimes your limit is the connection speed (for iSCSI or FC storage), or the controller and bus speed for directly attached storage.
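That diminishing return matches a simple idealized model in which striped throughput scales linearly with disk count, so going from n to n+1 disks gains roughly 1/n and the early disks matter far more than the late ones:

```python
# Ideal marginal gain from adding one disk to an n-disk stripe:
# throughput ~ n, so the (n+1)-th disk adds about 100/n percent.

for n in (2, 3, 5, 10):
    print(f"{n} -> {n + 1} disks: about +{100 / n:.0f}%")
```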
I hope choosing your RAID level will be a bit easier after reading this.