Ubuntu and RAID

Thought I would ask this here rather than in the microserver thread.

I plan to get 4 x 2TB drives and make a RAID5 array within Ubuntu, but I also need to partition off about 250GB separately.

Do you create the array, then create partitions on the array? Or is the array the partition?

Kimbie
 
Are you sure? I'm sure the raid array itself is like a partition, which you then format. And a quick google seems to confirm this.

In Ubuntu I'd use the Debian installer (alternate install disk) or:

-Boot the live disk.
-Install the packages required for RAID (I think they're all there by default).
-Create the Array from a number of partitions as desired.
-Format the array with mke2fs using the "-E stride=n" option (older versions spelled it "-R stride=n"), where n is the chunk size in 4k blocks; you can also pass "stripe-width" for the full data stripe (see the sketch after these steps).
-Fire up the installer from the desktop
-When partitioning, choose advanced and use the array as /, but don't format.
-Choose the first disk for grub stage 1 (/boot can live on the array with grub2).
-After the installation, open a terminal and:
Code:
mount
# check the array is still mounted to /target
sudo -s
mount -t sysfs sysfs /target/sys
mount -t proc proc /target/proc
chroot /target
apt-get update
# install any packages you needed to mount the array using apt-get or synaptic
-Reboot into the system.
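
For the create-and-format steps, something along these lines (a sketch only - the /dev/sd[b-e]1 names and the 512k chunk size are assumptions, check what mdadm actually used with mdadm --detail /dev/md0):
Code:
# assumes four RAID partitions /dev/sdb1..sde1 and the default 512k chunk
sudo mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1

# stride = chunk size / block size = 512k / 4k = 128
# stripe-width = stride x data disks = 128 x 3 = 384
sudo mke2fs -t ext4 -b 4096 -E stride=128,stripe-width=384 /dev/md0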
 
Ok I have Ubuntu installed on a separate 160GB HDD, the 2TB drives are yet to arrive.

So not sure how those instructions relate, damn this would be easier if I had the bloody disks lol

Kimbie
 
Practice on some small partitions at the end of the current drive.

Yes, Linux will let you make an array out of partitions on the same physical disk.
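
If you want something concrete to practice with, here's a rough sketch - the three 1GB partitions /dev/sda5-7 are made up, substitute whatever you actually carve out of the spare space:
Code:
# build a throwaway RAID5 from the practice partitions
sudo mdadm --create /dev/md9 --level=5 --raid-devices=3 /dev/sda5 /dev/sda6 /dev/sda7
cat /proc/mdstat                      # watch it sync
# format it, mount it, play... then tear it down:
sudo mdadm --stop /dev/md9
sudo mdadm --zero-superblock /dev/sda5 /dev/sda6 /dev/sda7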
 
If you have Ubuntu installed on a smaller drive.... check out this

System>Administration>Disk Utility>File>Create>RAID Array>RAID Level, and one of the choices is RAID 5!

Seems that you may be able to use this? But needs more research first... and I'm a *nix novice :(

One package you may need to install is mdadm, which is the software RAID admin tool :)
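
If it isn't already on the box it should only be a one-liner to pull in (package name is the same on Ubuntu as far as I know):
Code:
sudo apt-get update
sudo apt-get install mdadm
mdadm --version    # sanity check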
 
Are you sure? I'm sure the raid array itself is like a partition, which you then format. And a quick google seems to confirm this.


That's software raid, you don't want to go near that if you can help it :)

edit - ah sorry, didn't notice the OP mentioned "within Ubuntu". :o
 
The key thing here is: which RAID controller are you using?

If it is an ICHxR then it is a "software" RAID controller (even in Windows); if you have a proper RAID controller then you can run hardware RAID with no issues (Linux has drivers for pretty well all hardware RAID controllers).
 
The key thing here is: which RAID controller are you using?

If it is an ICHxR then it is a "software" RAID controller (even in Windows); if you have a proper RAID controller then you can run hardware RAID with no issues (Linux has drivers for pretty well all hardware RAID controllers).

I won't be using a RAID controller, just doing it in software; I'll be using an HP MicroServer.
 
I have been playing with mdadm for the last couple of months and if you are not command line shy then it is actually pretty easy to setup.

I use parted / gparted to create unformatted partitions on the drives aligning to the 4K sectors and then use these partitions and mdadm to create an array.

Say you gparted each drive with a 65GB partition and an 1855GB data partition (a 2TB drive comes in at around 1920GB, from memory).
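
Roughly what that looks like with parted, for one drive - I'm assuming /dev/sdb here, repeat for the other three; starting at 1MiB keeps everything 4K-aligned:
Code:
sudo parted --script /dev/sdb mklabel gpt
sudo parted --script /dev/sdb mkpart primary 1MiB 65GB
sudo parted --script /dev/sdb mkpart primary 65GB 100%
sudo parted --script /dev/sdb align-check optimal 1    # confirm partition 1 is aligned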

You could get mdadm to create a RAID5 array from the 65GB partitions, giving roughly a 195GB array (RAID5 across four partitions gives you three partitions' worth of usable space, and a little less again once formatted - bump the small partitions up to ~85GB if you need the full 250GB), and do the same for the second partitions on your drives to give a second array of around 5,565GB.

Use whatever you want to format them.
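
The two mdadm creates would then be something like this (device names are examples again, and the glob assumes bash):
Code:
# small array from the first partitions, big array from the second
sudo mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sd[bcde]1
sudo mdadm --create /dev/md1 --level=5 --raid-devices=4 /dev/sd[bcde]2
cat /proc/mdstat                      # both will resync in the background
sudo mkfs.ext4 /dev/md0
sudo mkfs.ext4 /dev/md1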

The wisdom of sharing disks between two RAID5 arrays may be questionable, especially with software RAID5, as working out the parity is fairly CPU intensive, and when it is being done on two arrays at the same time... well... you may see a slowdown on the server.

If your data is mission critical, with the requirement for instant or quick recovery, and it is updated regularly throughout the day, then RAID5 or even RAID10 may be well suited. If the data is not updated very often and is only for home use (NAS / media server) then why not have a stripe set (RAID0) or a JBOD array, and two disks as copies of the data? That gives you on-site backup, but you will need to keep the backup up to date, either manually or by a script/job.

I do this for my home NAS, which has mainly movies and music on it, and I just back up any new files I add to a separate set of disks. It saves having an array with redundancy and a second backup - unless you were not thinking of a second backup and were planning to use the redundancy of the array to keep your data safe... (clearly this is a trick question as this is not what you really should be doing ;), of course it is your data :D).
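
If you go the backup-disks route, keeping the copy up to date can be as simple as an rsync job - the paths here are just examples:
Code:
# mirror new/changed files to the backup disks; --delete also removes
# anything you deleted from the array, so leave it off if you want an archive
rsync -avh --delete /mnt/array/movies/ /mnt/backup1/movies/
rsync -avh --delete /mnt/array/music/  /mnt/backup2/music/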

If you can, upgrade the 160GB drive to something bigger and put the required 250GB second partition on there, leaving the 4 x 2TB drives purely for data, however you wish to distribute it.

I have been using mdadm on CentOS 5.5 and Fedora 14 so there may be slight differences on Ubuntu.

Oh, and as mentioned, I partition with parted / gparted to make sure the 4K sectors are taken into account before building the array.

One more issue I have found is with advanced format drives; specifically, the WD Caviar Green 1.5TB drives are not so good in arrays. Apart from the 4K sector consideration, I found them pausing every now and then, including in the middle of streaming a movie. This, I understand, is due to the aggressive head parking routines enabled to help the drive meet its "green" specs. I have read of people playing with the drives' firmware to try and disable this 'feature' but have not tried it myself. I sold my 1.5TB Greens and bought some 1TB Blacks. I will be staying away from "green" drives for my arrays, but will use them for the backup drives.
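
A quick way to check whether a drive is parking its heads aggressively is to watch the SMART load cycle count (assuming smartmontools is installed; the attribute is Load_Cycle_Count / 193 on the WD drives):
Code:
sudo apt-get install smartmontools
# if Load_Cycle_Count climbs by hundreds a day, the drive is parking aggressively
sudo smartctl -A /dev/sdb | grep -i load_cycle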

RB
 
That's software raid, you don't want to go near that if you can help it :)

Why not?

Intel/AMD now give us more cores than we know what to do with. Why fork out more money for a hardware RAID controller when we have cores with terrific performance per cost standing idle? We'll only see software RAID become more accepted as time goes by.

Valid reasons for hardware raid are:
- I want a battery backed disk cache in front of the array without a big mains UPS.
- I want dual redundant or dual active/active controllers.
 
Because something as simple as a driver update can lose your array. I'm not against software RAID in a test environment, and to a certain extent RAID1, but RAID0/5 should be a no-no with any data you value.
It simply doesn't bring any of the benefits of using RAID in the first place.
 
Because something as simple as a driver update can lose your array. I'm not against software RAID in a test environment, and to a certain extent RAID1, but RAID0/5 should be a no-no with any data you value.
It simply doesn't bring any of the benefits of using RAID in the first place.

I thought this would be the sensible route, i.e. a hardware RAID card configuration, if, as you say, you are building a RAID array for a server in a small business/enterprise set-up where the data stored on the array is valuable.

Surely with the number of drives the OP has mentioned it is more than a simple home storage solution?

I ask purely because I have 2 external 2-bay enclosures that I'm trying to work out how best to utilise... i.e. RAID1, RAID0 or JBOD - hence my interest in this thread :)
 
Because something as simple as a driver update can lose your array. I'm not against software RAID in a test environment, and to a certain extent RAID1, but RAID0/5 should be a no-no with any data you value.
It simply doesn't bring any of the benefits of using RAID in the first place.

Maybe, but in Linux the RAID array config/info is stored on the disks, so as long as your data array is separate from your install (which is what I am doing), I can reinstall Linux, rescan for the array and mount it.
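
For anyone wondering, the rescan-and-mount bit after a reinstall is roughly this (the mount point is just an example):
Code:
sudo apt-get install mdadm
sudo mdadm --assemble --scan          # picks the array up from the superblocks on the disks
cat /proc/mdstat                      # check it came back
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf    # make it stick across boots
sudo mount /dev/md0 /mnt/data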
 
Maybe, but in Linux the RAID array config/info is stored on the disks, so as long as your data array is separate from your install (which is what I am doing), I can reinstall Linux, rescan for the array and mount it.

AAAAAAhhhhhhhh

So... your 160GB HDD is now going to be your OS install drive and the 4 x 2TB HDDs are going to be used for data storage (in RAID 5) only?
 
Because something as simple as a driver update can lose your array. I'm not against software RAID in a test environment, and to a certain extent RAID1.

I have heard this given quite a few times as a reason not to use software RAID, and have in certain instances used it myself. It is pretty valid for motherboard RAID controllers but not so much for Linux: the arrays are transportable between machines with a reasonably close mdadm version.

If a hardware RAID card fails you are stuck either having to have a spare available or going out to find one. If a Linux software RAID array fails due to a software issue, just put a pre-patch version of that distro on another machine and mount it there, or revert the patch on the original machine (if possible). Software RAID is much cheaper, and the main areas where hardware RAID beats it are offloading the processing to a dedicated processor and having a BBU for power outages. Also, don't forget that hardware RAID controllers can have firmware patches which could also kill the array (although you would obviously hope the manufacturer has done some decent testing).

But RAID0/5 should be a no-no with any data you value.

Totally agree on RAID0, not so much on RAID5. It depends how many belts and braces you think are required to hold your trousers up. Should RAID5 with 4 disks not be used? With RAID6 on 4 disks you may as well use RAID10. Of course, any data you value should be backed up to a non-RAID set anyway.

It simply doesn't bring any of the benefits of using RAID in the first place.

Sorry? RAID0 and RAID5 do not bring the benefits of RAID? I guess that depends on what you believe the benefits to be. For me, personally, in a home environment, both bring benefits over not having an array at all. Of course, I could have misunderstood what you were trying to get at :).

Maybe, but in Linux the RAID array config/info is stored on the disks, so as long as your data array is separate to your install which is what I am doing, I can reinstall linux, rescan for the array and mount it.

Yep, very handy.

It really is horses for courses.... In a non-critical environment then software raid should be fine (not motherboard raid). In a critical environment you would want the reassurance of at least a BBU so hardware raid would be the obvious choice but then everything is usually a balance of cost / benefits.

Nikumba has not yet specified what the server will be used for, where it will be located (home / work) and what is the value of the data to be stored.

A lot of people here may well be from a corporate environment and so can share experiences on best practices from there whilst others are home users who may consider those practices to be pure overkill in their environments. Horses for courses.

Nikumba, can you enlighten us on;
Where the server will be used (Home/Work):
What will the array be used to store:
How important is the data for you:
Are you having a separate backup:
How much downtime can you tolerate if you lost the array:
What sort of budget is available:

With the answers to those questions, people can tailor the answer a lot better for your needs.

RB
 
Because something as simple as a driver update can lose your array. I'm not against software RAID in a test environment, and to a certain extent RAID1, but RAID0/5 should be a no-no with any data you value.
It simply doesn't bring any of the benefits of using RAID in the first place.
I've been running a RAID 5 mdadm array in Ubuntu for over half a year now, and it has done the following:

  • Allowed me to create the array from two 1TB disks, put data on, add another disk, expand the array, add another disk, expand the array, put data on, expand the array, add another disk, expand the array and put more data on (i.e. start out with only two and build up to five as I was moving data off the same disks that were used to build the array).
  • Allowed me to pick up on a failing disk by looking at the SMART data, order a replacement, swap it in, and rebuild the array, all online, without e.g. mounting it read only (the commands for growing the array and swapping a disk are sketched after this list).
  • Allowed me to add three disks over the course of several months doing the same as when I created the array, to a current disk count of eight 1TB F3s, so I didn't have to start out with more capacity than necessary.
  • Recently allowed me to replace the motherboard and rearrange the disk/controller mappings so now they are on Intel and Marvell controllers whereas before they were on Intel and JMicron controllers.
  • Allowed me to use various installs of various distros with the same array, each time having only to at most do apt-get/yum install mdadm and allow it to do the rest.
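
For reference, the grow and disk-swap steps boil down to something like this - device names are made up, and on older mdadm versions a reshape may want a --backup-file:
Code:
# add a new disk and grow the array by one device
sudo mdadm --add /dev/md0 /dev/sdf1
sudo mdadm --grow /dev/md0 --raid-devices=5
sudo resize2fs /dev/md0               # grow the filesystem once the reshape finishes

# swap out a failing disk while staying online
sudo mdadm --fail /dev/md0 /dev/sdc1
sudo mdadm --remove /dev/md0 /dev/sdc1
sudo mdadm --add /dev/md0 /dev/sdg1   # rebuild onto the replacement starts automatically
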
Software RAID (mdadm, not Windows or FakeRAID rubbish) offers all the advantages of hardware RAID (plus some of the above which you can't/don't get with hardware RAID) aside from:

  • Battery backed caches as RimBlock mentions.
  • True operating system independence.
  • CPU independence - though I've never even noticed the mdadm load let alone felt that it was slowing anything down.
 