Wednesday, February 17, 2010

ZFS data protection comparison

ZFS now offers triple-parity raidz3. Conceptually, raidz3 is an N+3 parity protection scheme. Today, there are few, if any, other implementations of triple-parity protection, so while we can say "raidz is similar to RAID-5" and "raidz2 is similar to RAID-6", there is no similar analogy for raidz3. I prefer to say "raidz3 is like raidz2 with one additional level of parity protection." But how much better is raidz3 than raidz2? To help answer that question, I used the simple Mean Time to Data Loss (MTTDL) model to calculate the data retention capabilities of the possible configurations of 12 disks under ZFS. To be fair, the same model applies to other RAID implementations, but I'll use the ZFS terminology here.

In this MTTDL model, the configuration includes N total disks. If the data protection scheme is raidz3, then the minimum N = 1 data disk + 3 parity disks = 4. You can add more data disks to increase the overall available space, so if N=6 then you have 3 data disks + 3 parity disks.

The model uses the Mean Time Between Failures (MTBF) as specified in a vendor's datasheet. It also uses a Mean Time to Repair (MTTR), which includes both the logistical repair time and any data reconstruction required. The simple model calculates MTTDL as:
For non-protected schemes (dynamic striping, RAID-0):
MTTDL[1] = MTBF / N
For single parity schemes (2-way mirror, raidz, RAID-1, RAID-5):
MTTDL[1] = MTBF^2 / (N * (N-1) * MTTR)
For double parity schemes (3-way mirror, raidz2, RAID-6):
MTTDL[1] = MTBF^3 / (N * (N-1) * (N-2) * MTTR^2)
For triple parity schemes (4-way mirror, raidz3):
MTTDL[1] = MTBF^4 / (N * (N-1) * (N-2) * (N-3) * MTTR^3)
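
As a rough sketch, the four formulas above can be read as one expression in the number of tolerated failures p: MTTDL = MTBF^(p+1) / (N * (N-1) * ... * (N-p) * MTTR^p). The Python below assumes that reading. The function name mttdl and the MTBF and MTTR values are illustrative assumptions, not the inputs used for the graph in this post, and the function covers a single set of N disks only.

```python
import math

HOURS_PER_YEAR = 24 * 365


def mttdl(mtbf_hours, mttr_hours, n_disks, parity):
    """Simple MTTDL model for one set of n_disks that survives `parity` failures.

    parity = 0 is a stripe, 1 is raidz or a 2-way mirror, 2 is raidz2, 3 is raidz3.
    Returns the MTTDL in hours.
    """
    if not 0 <= parity < n_disks:
        raise ValueError("parity must be in the range [0, n_disks)")
    # N * (N-1) * ... * (N-parity): the disks that can fail in sequence.
    failure_orderings = math.prod(n_disks - i for i in range(parity + 1))
    return mtbf_hours ** (parity + 1) / (failure_orderings * mttr_hours ** parity)


# Assumed example inputs (hypothetical, not taken from a datasheet).
MTBF_HOURS = 1_000_000
MTTR_HOURS = 24

for name, parity in [("stripe", 0), ("raidz", 1), ("raidz2", 2), ("raidz3", 3)]:
    years = mttdl(MTBF_HOURS, MTTR_HOURS, 12, parity) / HOURS_PER_YEAR
    print(f"12-disk {name}: MTTDL = {years:.3g} years")
```

Configurations built from several sets, such as striped mirrors, are not covered by this sketch.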
A graph of the results for combinations of 12 disks looks like:



The results are consistent with previous MTTDL analysis. The 12-disk stripe has an MTTDL of 6.7 years, which isn't very good (annualized rate = 15%), whereas the 12-disk 4-way mirror MTTDL is 2.75e+13 years (annualized rate = 3.63e-12%) and the 12-disk raidz3 MTTDL is 1.67e+11 years (annualized rate = 5.99e-10%).
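
The annualized rates quoted above appear consistent with the first-order approximation rate = 100% / MTTDL, with MTTDL in years. The short Python sketch below assumes that conversion; the function name is hypothetical, and the MTTDL values are the ones quoted in this post.

```python
def annualized_rate_percent(mttdl_years):
    """Approximate annual data-loss rate in percent, assuming rate = 1 / MTTDL."""
    if mttdl_years <= 0:
        raise ValueError("MTTDL must be positive")
    return 100.0 / mttdl_years


# MTTDL values (in years) as quoted in the post.
quoted_mttdl_years = {
    "12-disk stripe": 6.7,
    "12-disk 4-way mirror": 2.75e13,
    "12-disk raidz3": 1.67e11,
}

for config, years in quoted_mttdl_years.items():
    print(f"{config}: {annualized_rate_percent(years):.3g}% per year")
```

The approximation is only reasonable when the MTTDL is much longer than one year; for the stripe it is already a coarse estimate.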

The theory behind raidz3 extends to more parity disks. But at some point, the system design will be dominated by common-cause failures and not the failure of independent disks. I hope this model will be useful for you to evaluate the data retention of your storage system.

Thursday, February 4, 2010

ZFS training in Atlanta, March 16-18, 2010

I will be presenting a 3-day training session for systems and storage administrators on ZFS and NexentaStor in the Atlanta area March 16-18, 2010. The team has put together a fantastic syllabus including in-depth exposure to the latest ZFS and NAS trends.


Attendees can choose to attend the three-day program, or the two-day advanced portion. The course is structured as follows:

  • Day 1: Introduction to ZFS and Nexenta Systems Storage Technologies
  • Day 2: De-duplication in a VM World
  • Day 3: Optimizing NAS Performance

Attendees should have some familiarity with storage concepts and terminology, but the course does not assume any knowledge of ZFS or familiarity with the NexentaStor storage appliance.
The course will include hands-on exercises with ZFS and NexentaStor.
Best of all, lunch will be provided each day.
To sign up or view the detailed syllabus, visit the nexenta-atlanta.eventbrite.com event registration site.