Single drive versus RAID1
Madison Kelly
linux-5ZoueyuiTZhBDgjK7y7TUQ at public.gmane.org
Mon Oct 20 22:22:40 UTC 2003
Matthew Godycki wrote:
> Hi Madison,
>
> While I'm fairly familiar with the major RAID configurations (heck, a few years ago when I was a youngin' and used to teach data structures at UofT I even gave my students an intro to RAID) I wouldn't mind having a gander at your text. It might be interesting to see if maybe I can pick up some extra pros/cons that I haven't yet seen, especially with regard to the more interesting RAID configurations.
>
> Cheers,
> -Matt
Keep in mind that this document is well over a year old and probably
needs some updating, but the foundation hasn't changed. Hope you get
some use out of it.
Madison
-= See Below =-
[ IDE and SCSI stuff snipped ]
RAID:
So now you see the basic differences between IDE and SCSI. Well, what
happens if all the goodness of SCSI just isn’t enough? “RAID” stands for
“Redundant Array of Inexpensive Disks”. It’s funny to hear that when you
consider a decent entry-level server with RAID starts around $6,000! You
need to remember that the term was coined back when disk space was still
measured in $$/MB.
You don’t technically need SCSI drives to implement RAID; in fact, there
is a trend in the performance home market right now toward using IDE
drives in RAID arrays. I wouldn’t yet recommend it for people needing
high levels of reliability or processing power, though.
So what is a RAID array? Being here, you probably have enough interest
in computers to have heard of RAID, but unless you are slightly obsessed
with hard drive technology you probably haven’t learned much about it.
RAID has been, and to a large part still is, the domain of higher-end
servers.
RAID describes three main abilities that can be implemented either alone
or in combination to best fit various scenarios. These features are
“striping”, “mirroring” and “parity”.
Striping, known as RAID level 0 or RAID0, is the process of using two or
more drives for simultaneous writing and reading. When a file is to be
written to a striped array, the data is divided into chunks and written
to the drives in the array at the same time. As a loose example, you can
take a 10MB file and write it to a RAID0 array with two drives in
roughly the time it would normally take to write a 5MB file (twice the
speed). The same 10MB file could be written to an array with five drives
in roughly the time it would have taken a 2MB file to be written to a
single drive (five times as fast). Calculating the actual speed benefit
isn’t so cut and dried because of other overhead, but you get the idea.
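The idea can be sketched in a few lines of Python (a toy illustration
only; a real controller does this in hardware, writing the chunks to the
drives in parallel):

```python
def stripe(data: bytes, n_drives: int, chunk: int = 4096):
    """Divide data into fixed-size chunks and deal them out
    round-robin across the drives, RAID0-style."""
    drives = [bytearray() for _ in range(n_drives)]
    for i in range(0, len(data), chunk):
        drives[(i // chunk) % n_drives].extend(data[i:i + chunk])
    return drives

# A 10-byte payload over two drives: each drive only writes ~5 bytes,
# which is where the (roughly) doubled write speed comes from.
parts = stripe(b"0123456789", 2, chunk=1)
print([bytes(p) for p in parts])   # [b'02468', b'13579']
```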
Next up is “Mirroring”, or RAID1. As its name implies, two drives are
mirror images of one another. If one drive fails, the data is safe
thanks to the second, identical drive. The downside is that 50% of the
physical hard drive space is wasted.
Finally we get to “Parity”, used in RAID3, 4, 5 and 6, but most
popularly in RAID5. Remember in math class when you asked, “Where will I
ever use this in the real world?” Well, my friends, Boolean algebra has
given us a very efficient way to protect data. Let’s use a RAID5 array
for this example, but first let me describe one.
In a RAID5 array you need a minimum of 3 disks. The more you add,
though, the better performance you gain and the more efficiently you use
your disk space. The trade-off is that you need an increasingly powerful
RAID controller, and that translates to a higher cost. In a RAID5 array,
performance is increased by striping data across the available drives
(as in RAID0). In a RAID0 array, though, a single disk failure will
destroy all the data, because part of just about every file is on each
disk. Parity is added in RAID5 to deal with this.
Parity works by taking the data on each disk and cumulatively applying
the Boolean “XOR” to come up with parity data. This last piece of
redundant data can be used to rebuild any one piece of missing data in
the array (one failed drive). For this reason RAID5 is described as
making use of N-1 disks in the array. In the minimum three-drive array
we waste 33% of the space (3-1 = 2 effective disks). In a five-disk
array we improve that to 20% wasted space (5-1 = 4 effective disks).
The downside to RAID5 is that when a failure occurs, read times drop,
potentially by a marked amount. This is because for every byte of data
requested in a read, the RAID controller must first run the XOR on the
remaining good data plus the parity byte before it knows what the
missing data is.
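Here is what that XOR trick looks like as a small Python sketch
(illustration only; a real controller does this per-stripe in hardware):

```python
from functools import reduce

def xor_blocks(blocks):
    """Cumulatively XOR a list of equal-length blocks together."""
    return bytes(reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks))

# Three data blocks in one stripe; the parity goes on a "fourth disk".
d0, d1, d2 = b"abcd", b"efgh", b"ijkl"
parity = xor_blocks([d0, d1, d2])

# "Drive 1" fails: XOR the survivors with the parity to rebuild it,
# since d0 ^ d2 ^ (d0 ^ d1 ^ d2) = d1.
rebuilt = xor_blocks([d0, d2, parity])
print(rebuilt == d1)   # True
```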
I am sure by now that my 20 minutes are almost up. Thank you for
listening to me today, and I hope this wasn’t too fast or confusing and
that it at least whetted your appetite to learn more about hard disk
technology. I will be doing a presentation at March’s TLUG meeting where
I will go into much more detail about underlying disk technology and how
it relates to servers and Linux. If you would like to learn more, I
encourage you to join me then.
Does anyone have any questions?
Thank you all very much for your attention! The paper for this talk will
be available on my web site as soon as some work is done. At that time
there will be links that will allow you to study this and other topics
in greater detail. Good night!
Notes:
[ Table Snipped (wouldn't format in ASCII) ]
RAID Levels:
Level 0 = Striping, not a “true” RAID level because there is no
redundancy. RAID0 is designed to increase performance by cutting data
into uniformly sized blocks and sequentially writing them to the disks
in the array. The total capacity (T) is found by multiplying the number
of disks in the array (x) by the storage capacity of the smallest drive
(s). T=x*s
Level 1 = Mirroring, the oldest type of RAID. Very popular due to its
ease of use and high reliability. Write times can be slightly diminished
because the data must be written twice, but a good hardware RAID
controller often negates this. Read times are quicker, almost double the
speed, because the data only needs to be read from one drive. The
performance hit during a failure is minimal and the processing power
needed to implement RAID1 is small. The main drawback is the initial
cost to configure, because of the inefficient use of disk space. RAID1
requires an even number of drives. The total storage capacity (T) is
found by multiplying the number of drives in the array (x) by the
capacity of the smallest drive (s) and dividing by 2. T=(x*s)/2
Level 2 = Defined but rarely used because it requires special hard
drives that manufacturers are reluctant to build. RAID2 splits data at
the bit level and stripes it across the data disks. For each strip of
data Hamming ECC data is generated and stored on multiple parity disks.
The main advantage of RAID2 was its ability to correct single-bit errors
“on the fly”, because both the good data and the parity data were read
and analyzed on every read. This Error Correction Code (ECC) is now a
standard feature within hard drives, negating most of the benefits of
RAID2 right there. RAID2 also suffered from modest performance and
reliability when compared to other RAID levels, particularly when the
number of disks needed is considered. A standard RAID2 array called for
10 data disks plus 4 parity disks or 32 data disks plus 7 parity disks.
Level 3 = RAID3 is still used today but isn’t as popular as other
levels. RAID3 implements striping with parity and uses a dedicated
parity disk. The data is split into blocks, usually smaller than 1024
bytes, allowing it to make efficient use of striping. The downside is
that RAID3’s single parity disk causes a bottleneck when there are many
I/O transactions at the same time. RAID3 is best suited for environments
where redundancy is required and most of the disk access involves large
file reads/writes, as you would see creating multimedia. Total capacity
(T) is found by multiplying the number of drives in the array minus one
(x-1) by the storage capacity of the smallest disk (s). T=(x-1)*s
Level 4 = When they wrote that song “Stuck in the Middle with You” they
must have been thinking of RAID4. Like RAID3 and RAID5, it stripes data
across disks with parity. Like RAID3, it uses a dedicated parity disk
that can be a bottleneck, but it uses larger data blocks like RAID5. The
larger data blocks improve performance when there are a high number of
simultaneous I/O transactions but make less efficient use of striping.
You calculate the total storage as you did for RAID3: T=(x-1)*s
Level 5 = Perhaps the most popular RAID level when both performance and
reliability are required. RAID5 stripes both data and parity information
across all the disks in the array, removing the bottleneck seen in RAID3
and RAID4. It uses larger data blocks than RAID3, making it best suited
for multiple simultaneous I/O transactions (web server, anyone?). During
a failure the performance will suffer more than if just the parity disk
failed in a RAID3 or RAID4 array, because reads/writes will require an
analysis of parity data to rebuild missing data. Total storage capacity
is calculated as for RAID3 and RAID4: T=(x-1)*s
Level 6 = RAID6 is very similar to RAID5 except that two blocks of
parity information are recorded, allowing the array to survive two
simultaneous drive failures. This isn’t a popular RAID level, though,
because the ability to have hot spares in a RAID5 array means the time
when the array is vulnerable is very short. With a hot spare in a RAID5
array you don’t suffer the write performance hit seen in RAID6, where
two separate parity blocks must be calculated and recorded on each
write. You calculate the total capacity of a RAID6 array as for RAID5,
except you lose two disks to parity information: T=(x-2)*s
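The capacity rules above can be collected into one small Python helper
(writing each formula as usable-drive-count times drive size, as the
prose descriptions do; x is the number of drives, s the smallest drive’s
capacity):

```python
def raid_capacity(level: int, x: int, s: float) -> float:
    """Usable capacity of x drives of size s at a given RAID level."""
    if level == 0:
        return x * s          # striping only: all raw space is usable
    if level == 1:
        return (x * s) / 2    # mirroring: half the raw space
    if level in (3, 4, 5):
        return (x - 1) * s    # one drive's worth lost to parity
    if level == 6:
        return (x - 2) * s    # two drives' worth lost to dual parity
    raise ValueError("no simple formula for this level")

# Five 100 GB drives in RAID5: 400 GB usable, i.e. 20% "wasted".
print(raid_capacity(5, 5, 100))   # 400
```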
Level 7 = This isn’t a true RAID level, in that it is proprietary: the
controller and standard are owned and controlled by one company. It
tries to solve some of the problems of RAID3 and RAID4, but details are
sketchy. Not a recommended RAID level.
Combining Levels = Not long after RAID began life, people started
wondering about combining RAID levels to get the benefits of two levels.
Before I talk about those levels I want to make a few comments. The
order in which the RAID levels are noted is important; for example, RAID
10 is not RAID 01. In this example, RAID 10 is an array of striped
mirrors, as opposed to RAID 01, which is a mirror of striped arrays. Be
careful when faced with RAID 53: this can be RAID 03 or even 30,
depending on its implementation.
Level 01 and 10 = In RAID 01 (zero one) two striped arrays of equal size
are created then data is duplicated (mirrored) on each array. This
provides good fault tolerance by being able to survive multiple disk
failures so long as they are on the same array. Performance is also good
because of the speed benefits of writing to a striped array and the
speed benefits of reading from a mirrored array.
RAID 10 (one zero, not ten) creates mirrored pairs then stripes the data
across the pairs. This provides a slightly higher level of reliability
because the loss of a drive only affects the mirror it belongs to. This
helps minimize degradation and rebuild time because less data is
mirrored per set.
Both RAID 01 and RAID 10 have the benefit of redundancy without the
overhead of parity. Both also require a minimum of four disks and the
total number must be even. Calculate the total capacity of the array
like you do in RAID1, T=(x*s)/2.
Level 03 and 30 = (zero three and three zero) Hope you have done your
mental yoga; these combinations are probably the most difficult to
picture.
RAID 03 is built by using striped sub-arrays as the “disks” of a RAID3
array. In this configuration the byte data and parity are striped across
multiple RAID0 arrays (minimum 3 RAID0 arrays, each with a minimum of 2
drives). This gives performance levels closer to RAID0, but a little
slower because parity still needs to be calculated and written. Assuming
all the disks are the same size, the total capacity (T) is found by
calculating the capacity of each RAID0 array (ra) and multiplying it by
the number of RAID0 arrays (rb) minus 1: T=ra*(rb-1)
RAID 30 is just the opposite, striping data across two or more RAID3
sub-arrays. This level is the more popular implementation of combining
block striping with byte striping and parity. It provides better fault
tolerance and performance because each RAID3 sub-array in the RAID0
stripe is independently protected. It also makes more sense to
byte-stripe blocks of data, because you are making smaller pieces out of
larger chunks of data. Assuming again that all the drives in the array
are the same size (s), you can calculate the total capacity of the array
(T) by taking the number of drives in each RAID3 sub-array (ra) minus 1,
multiplying by the drive size, then multiplying by the number of
sub-arrays in the stripe (rb): T=(ra-1)*rb*s
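A small Python sketch of the two capacity calculations, using the same
variable names as above (note ra is a sub-array capacity for RAID 03 but
a drive count for RAID 30):

```python
def raid03_capacity(ra: float, rb: int) -> float:
    """ra = capacity of each RAID0 sub-array; rb = number of sub-arrays."""
    return ra * (rb - 1)        # one sub-array's worth of space holds parity

def raid30_capacity(ra: int, rb: int, s: float) -> float:
    """ra = drives per RAID3 sub-array; rb = sub-arrays; s = drive size."""
    return (ra - 1) * rb * s    # each sub-array gives up one drive to parity

# Six 100 GB drives either way yield 400 GB usable:
# RAID 03: three 2-drive RAID0 sets of 200 GB each -> 200 * (3-1) = 400
# RAID 30: two 3-drive RAID3 sets                  -> (3-1) * 2 * 100 = 400
print(raid03_capacity(200, 3), raid30_capacity(3, 2, 100))   # 400 400
```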
In both cases RAID 03 and RAID 30 provide the best transfer rates
but still suffer from bottlenecks when the number of simultaneous I/O
transactions increase. These configurations are best suited for
applications where large files are used by a few people (like in
multimedia) and regular RAID3 isn’t fast enough.
Level 05 and 50 = (zero five and five zero) Apply the differences
discussed between RAID3 and RAID5 to what was just said about RAID 03/30
and you have RAID 05/50. The main difference between RAID 03/30 and RAID
05/50 is that RAID 05/50 increases performance under simultaneous I/O
transactions by using larger blocks of data and by removing the
bottleneck of a single parity disk. This makes RAID 05/50 better suited
for scenarios where many users request simultaneous reads/writes to the
array and RAID5 alone doesn’t provide enough speed. Calculate total
capacity as you did for RAID 03/30.
Level 15 and 51 = (one five and five one) It doesn’t get any more
reliable, or inefficient, than this! RAID 15/51 is the most reliable
method of storing data in a single server, combining the raw
availability of RAID1 mirrors with the performance and reliability
benefits of RAID5. Realistically, if this level of reliability were
required you would be looking at high-availability servers, where
everything, not just the disk array, is redundant.
References:
· The “T10” committee responsible for developing the SCSI standards:
http://www.t10.org/
· The “T11” committee responsible for developing the FC, HIPPI and IPI
standards:
http://www.t11.org/
· The “T13” committee responsible for developing the ATA standards:
http://www.t13.org/
· “The PC Guide”, by Charles M. Kozierok. THE site on the web for
double-checking yourself on computer hardware. If he ever reads this,
“Thanks!”:
http://www.pcguide.com/
--
The Toronto Linux Users Group. Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml