RAID 5 Whoopsie...

...and by whoopsie I mean total **** up on my part.

I have an 8TB RAID 5 array comprising three 4TB SAS disks, controlled by an HP P410 controller. I've stupidly never set up a backup. Ever. I came home to find the D drive missing on my server and a dreaded repeating clicking sound coming from one of the drives. I shut down, loaded up the P410 utility, and yup, disk 2 MISSING. The other two are fine, yet the array is in a failed state. I replaced the failed disk, let the whirring/activity stop, then booted back into the utility. The disk is showing but the array is still failed, with a warning about data being lost. I brought the array back online and let the rebuild process complete. It failed. I can see the D drive in the OS but all the files and folders are corrupt. Gutted.

So now I have the task of trying to salvage anything I can. They're SAS drives, so I can't simply plug them into another PC. Shall I buy another cheap SAS RAID controller, plug the disks in separately and back them up, before trying some form of software RAID data recovery?

Any advice would be greatly appreciated, but please lay off the "you should have backed up". I know that, and hindsight is a bitch. I'll not be cheap/lazy next time.
 
As much as I really want to help you here, moving between controllers, especially HP ones, is asking for more trouble. If you replaced the failed disk (you are sure it was the right disk you pulled, right?) and it failed to rebuild the array, then usually that is the time to flatten it and restore from a backup.

I do hope you can bring it up but ultimately prepare yourself for the worst.

Yeah, I'm pretty much bringing myself around to the idea that it's all gone, especially given that the majority of the files were several GBs. Its main purpose was my Plex and file server, and it housed all my GoPro and OBS Studio footage. It's more the WIP Premiere videos and holiday/airshow RAW photos that I care about the most, but thankfully most of the photos I'd marked as keepers had already been edited and exported to Google Photos.
 
File Scavenger

http://www.quetek.com/

I've not used the RAID version (it may be combined now), but I paid for that years back and it saved me from a failed WD My Book.

Since then they've added a second piece of software: Disk Recoup.

So even though the logical drive has failed, if Server 2012 can still see the 8TB array (albeit with corrupt files), this might work to recover them from the OS level? I was thinking I'd have to pull the drives, put them into a SAS controller as separate disks, back them up bit by bit, and then use a RAID reconstruction app to salvage the files.
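
Roughly what I had in mind for the bit-by-bit backup, if I can get the disks showing up individually under Linux, using GNU ddrescue from a live environment (only a sketch; the device names and backup paths below are guesses, not what my system actually calls them):

```bash
# Image each member disk to a file, with a map file so the copy can be
# resumed and bad sectors retried later. Repeat for each of the three disks.
sudo ddrescue -d -r3 /dev/sda /mnt/backup/disk1.img /mnt/backup/disk1.map
sudo ddrescue -d -r3 /dev/sdb /mnt/backup/disk2.img /mnt/backup/disk2.map
sudo ddrescue -d -r3 /dev/sdc /mnt/backup/disk3.img /mnt/backup/disk3.map
```

From then on any recovery attempts would only ever touch copies of those images, so the originals stay exactly as they are now.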
 
Can you take the HP controller out of the PC it is in (assume a NAS) and plug that and the drives into another PC?
It's always wise to have drives labelled so you know which one you are pulling. (I say this, but I don't think I did it in my home NAS.)
My business server is a Dell PowerEdge, so it's pretty easy to tell which drive you need to pull. Don't know why I didn't get a Dell server to use as a NAS at home, tbh.

Mine was around £400 off eBay; just had to put drives in it.

It's currently in an HP MicroServer. I haven't entirely ruled out the P410 controller being at fault, but I find it unlikely given the noise the disk was making as it failed, and that the same disk was marked as MISSING on the controller screen. The only reason I used SAS drives is that they were given to me at work as they were unneeded. Couldn't argue with £600 of free enterprise disks, but I would have been in a lot less of a mess had they been regular SATA :/
 
Not sure, but the free version will show you the files and AFAIR will not recover them, or possibly lets you recover one at a time (it's been that long).

And I think the normal version and the RAID version are one and the same now.

Just had a look, and the RAID recovery seems to be a pro service rather than a consumer product. I'll keep checking.
 
I successfully used software from runtime.org to rebuild a fubar'd RAID set of 3 disks. It was about 10 years ago, but the principles of RAID haven't changed since then.

But was this done with three independent disks, or just from the OS on the logical drive?
 
Not sure if you missed my suggestion of using Linux mdadm; I have successfully recovered a failed RAID 5 array using this method before.

https://raid.wiki.kernel.org/index.php/RAID_Recovery
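
For reference, this is the rough shape of it. Very much a sketch: because your array came from a hardware controller there's no md superblock on the disks, so you'd be re-creating the array over writable copies of your per-disk images rather than the originals, and the chunk size, parity layout, metadata placement and drive order below are guesses you'd have to experiment with (the wiki's overlay-file method is the safer way to iterate):

```bash
# Attach writable COPIES of the per-disk images (keep the pristine images elsewhere)
sudo losetup /dev/loop1 /recovery/copy-disk1.img
sudo losetup /dev/loop2 /recovery/copy-disk2.img
sudo losetup /dev/loop3 /recovery/copy-disk3.img

# Re-create the array without resyncing. --metadata=1.0 puts the md superblock
# at the end of each member so the data area starts at sector 0; chunk size,
# layout and device order are guesses and may take several attempts.
sudo mdadm --create /dev/md0 --assume-clean --metadata=1.0 --level=5 \
     --raid-devices=3 --chunk=256 --layout=left-symmetric \
     /dev/loop1 /dev/loop2 /dev/loop3

# Mount read-only and check whether the files look sane before going any further
sudo mount -o ro /dev/md0 /mnt/recovery
```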

Thanks, I've been reading up on that. Is this done in a separate system with the disks mounted separately? My disks are SAS, meaning I unfortunately can't transfer them to a spare PC. I'm wondering whether I can just get a PCI SAS controller and connect them up, but I'm wary that I'll further damage the data.
 
Pike is right IMO. SAS card wise, something with two channels will do up to 4 SAS drives (2 per channel), and in IT mode it will literally just present the drives to the OS.
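
If it helps, once the card is in IT mode you can sanity-check that each disk is being presented individually with something like this (just an illustration; the exact output depends on the card and drives):

```bash
# List block devices with their size, model, serial and transport (should show 'sas')
lsblk -o NAME,SIZE,MODEL,SERIAL,TRAN
```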

Before you go any further, though, have you actually looked at why the initial rebuild failed? Any chance it's a faulty replacement drive, or something trivial such as a cable fault?

I don't know why the rebuild failed, but then I don't know why the array failed either. I assumed it was a single drive failure given the noise it made. Right now I'm happy to ditch SAS RAID 5 entirely; all I want is the data back. Even if I can't get it, I'll still be ditching this setup. I'm tempted to get four 1TB SSDs and go RAID 6 or 10. 2TB is probably enough, although it was nice having 8TB and not having to worry about running out of space. Wish I'd put some energy into backups, mind you...
 
What's the point in doing a RAID setup if you can't guarantee you can recover from a faulty disk?

Is RAID 5 not reliable? If not, why does it even exist?

I've only ever done RAID 1 setups - as all I wanted was redundancy.

Complacency, laziness, naivety. I can go on, but what will it achieve?
 
Firstly, RAID is not a backup in itself. You may keep a backup on a RAID array, but never mistake redundancy for a backup. R5 fell out of favour when it became increasingly obvious that a perfect rebuild was becoming mathematically improbable as drive sizes scaled up. Between unrecoverable read errors and the risk of additional drives failing during a rebuild (they tend to be at similar points in their life cycle, and a rebuild puts extra load on them), it's no longer considered acceptable to run R5 in a lot of places/scenarios.
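
As a rough back-of-envelope example (assuming the commonly quoted consumer URE rate of one unrecoverable read error per 10^14 bits; enterprise SAS drives are usually rated an order of magnitude or two better, so the odds there are less grim): rebuilding a 3 x 4TB R5 array means reading roughly 8TB, i.e. about 6.4 x 10^13 bits, from the two surviving drives, so the chance of hitting at least one URE during the rebuild is about 1 - (1 - 10^-14)^(6.4 x 10^13), which works out to roughly 47%. One bad read in the wrong place is enough to derail the rebuild.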

R0 used to be a thing for speed; SSDs fixed that. Then it was a thing for capacity when large SSDs were expensive (ignore the issues with TRIM, people still do/did it); now NVMe has largely fixed that too, and capacity has reached a point where it's reasonably priced. R1 still has a place in some scenarios, but realistically we're approaching a point where fast connections and cloud-based storage are viable options for an offsite backup, and are becoming viable even for 'disposable' data, e.g. media.

Anyway, @Amraam how did you get on?

I gave up for a while, leaving the whole thing powered off. Currently have the disks out whilst I boot up the controller to set them to pass-through... if I can work out how to on this P410 controller, that is.
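
From what I've read so far, the P410 doesn't appear to have a true HBA/pass-through mode at all; the workaround people usually describe is creating a single-drive RAID 0 logical drive per disk, which I'm not keen to try on these disks until they're imaged. For now I'm just inspecting what the controller reports using the HP CLI tool (hpacucli here; the slot number is just an example and read-only 'show' commands are all I'm running):

```bash
# Read-only inspection of what the Smart Array controller can see
hpacucli ctrl all show config
hpacucli ctrl slot=0 pd all show detail   # physical drive status
hpacucli ctrl slot=0 ld all show detail   # logical drive / array status
```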
 
I've had this before; it was called a RAID puncture and was diagnosed from the RAID controller logs. In my case I think only one disk completely failed and another was flagged as "predictive failure". I was lucky that the puncture seemed to be in the free space and the filesystem was healthy enough to clone to a single SSD, delete the punctured array, re-create it with new disks and clone the data back. Replacing faulty disks in the existing array wouldn't have fixed it.

It doesn't sound very promising if you're seeing a corrupt filesystem, though. Maybe you could clone the whole drive to an image and run file recovery tools on it? 8TB is going to be a challenge.
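
If you do go the image route and have a Linux box to hand, something along these lines works on the image afterwards (just a sketch; the paths and loop device names are placeholders, and testdisk/photorec are free alternatives to the commercial tools mentioned above):

```bash
# Attach the image read-only so nothing can write to it
sudo losetup --find --show --read-only --partscan /recovery/array.img   # -> /dev/loop0

# First try mounting the volume read-only and copying files off normally
sudo mount -o ro /dev/loop0p1 /mnt/recovery

# If the filesystem is too far gone to mount, fall back to file carving
sudo photorec /log /d /recovery/carved/ /dev/loop0
```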

That's what I'm trying to do now. Doing it at work where I have oodles of TBs. Not holding my breath... :(
 
Yikes, it's a slow process using this Celeron CPU. It's been running all day capturing a compressed image so I can fit it on a server here at work. It's at ~5% :/ Reckons it'll be another 9 days until completion...
 
Four days later, the compressed image has been captured. 7.28TB array, of which about 3.5TB used. Compressed to a 2.05TB image file. R-Studio is now scanning this image. Fingers crossed...
 
Image mounted and scanned. Loads of files, including large MP4s etc., are marked green and healthy. Recover them and they won't play or open. What a disaster...

Even 5MB JPEGs are FUBAR'd.

Many, many files are reported in R-Studio as green/healthy and recoverable. Even small 10KB .ini files are wrecked though; try to open them in Notepad++ etc. and they're corrupt. I'm looking at 100% data loss here. Man, I'm an idiot for not having a backup...
 
There is at least a small ray of light. I installed this server back in 2016, and some of the data I've lost is still (hopefully) on the 1TB disk of my last desktop. Fingers crossed files can be recovered from there, but it's still three years old, meaning all my raw photos/GoPro footage from the last couple of holidays is gone forever.
 
You see the irony in asking about rebuild times for old mechanical drives in a thread started because of the failure of a mechanical R5 array that used old drives to rebuild?

The three main reasons for R5 are redundancy, space and speed. Given 300GB drives aren't really ticking the last two boxes, you'd probably be better off looking at something flash-based, and running R1 if redundancy is a priority. That said, the failure rate on flash should be significantly lower than on multiple mechanical drives anyway.

Yup, and I've learned my lesson. All my data is gone. Whilst I could take it to a pro and probably get some back, it's not life-critical, so I'll not bother and will chalk it up to user (me) stupidity. No more RAID for me, as I don't really need it, and of course I'll rely on backups! I'm probably going to rely on cloud storage from now on, and if I do look to build a Plex server again, I'll just house my movies on some 1TB SSDs when they become cheap.
 