RAID 5 Whoopsie...

Soldato
OP
Joined
5 Jan 2009
Posts
4,759
Not sure if you missed my suggestion of using Linux mdadm; I have successfully recovered a failed RAID 5 array with this method before.

https://raid.wiki.kernel.org/index.php/RAID_Recovery
Thanks, I've been reading up on that. Is this done in a separate system with the disks mounted individually? My disks are SAS, meaning I unfortunately can't transfer them to a spare PC etc. I'm wondering if I could just get a PCIe SAS controller and connect them up, but I'm wary of further damaging the data.
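For reference, the wiki's first steps look roughly like this, as a read-only sketch. Device names are placeholders for whatever the HBA presents, and since this array came off a hardware controller rather than mdadm, there may be no md superblocks to find, in which case the wiki's overlay/re-create route would be needed instead:

```python
#!/usr/bin/env python3
"""Read-only first steps from the kernel RAID wiki, sketched in Python.

Assumption: the four SAS drives show up as /dev/sdb../dev/sde once they
sit behind an HBA in IT/pass-through mode. Nothing here writes to them.
"""
import subprocess

DRIVES = ["/dev/sdb", "/dev/sdc", "/dev/sdd", "/dev/sde"]  # placeholders

# Dump each member's RAID superblock so event counts, roles and update
# times can be compared before attempting any assemble.
for drive in DRIVES:
    subprocess.run(["mdadm", "--examine", drive], check=False)

# Attempt a forced, read-only assemble: --readonly keeps md from writing
# anything back to the members while the array is imaged.
subprocess.run(
    ["mdadm", "--assemble", "--readonly", "--force", "/dev/md0", *DRIVES],
    check=False,
)
```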
 
Soldato
Joined
29 Dec 2002
Posts
7,234
Pike is right imo. SAS card wise, something with two channels will handle up to four SAS drives (two per channel), and in IT mode it will literally just present the drives to the OS.

Before you go any further, have you actually looked at why the initial rebuild failed? Any chance it's a faulty replacement drive, or something trivial such as a cable fault?
 
Soldato
OP
Joined
5 Jan 2009
Posts
4,759
Pike is right imo. SAS card wise, something with two channels will handle up to four SAS drives (two per channel), and in IT mode it will literally just present the drives to the OS.

Before you go any further, have you actually looked at why the initial rebuild failed? Any chance it's a faulty replacement drive, or something trivial such as a cable fault?

I don't know why the rebuild failed, but then I don't know why the array failed either. I assumed it was a single drive failure given the noise it made. Right now, I'm happy to ditch SAS RAID 5 entirely; all I want is the data back. If I can't, I'll still be ditching this setup. Tempted to get four 1TB SSDs and go RAID 6 or 10. 2TB is probably enough, although it was nice having 8TB and not having to worry about running out of space. Wish I'd put some energy into backups, mind you...
 
Soldato
Joined
18 Oct 2002
Posts
14,007
Location
Sandwich, Kent
What's the point in doing a RAID setup if you can't guarantee you can recover from a faulty disk?

Is RAID 5 not reliable? If not, why does it even exist?

I've only ever done RAID 1 setups, as all I wanted was redundancy.
 
Soldato
OP
Joined
5 Jan 2009
Posts
4,759
What's the point in doing a RAID setup if you can't guarantee you can recover from a faulty disk?

Is RAID 5 not reliable? If not, why does it even exist?

I've only ever done RAID 1 setups, as all I wanted was redundancy.

Complacency, laziness, naivety. I can go on, but what will it achieve?
 
Caporegime
Joined
18 Oct 2002
Posts
25,289
Location
Lake District
The reason it failed is that you probably had uncorrectable read errors (UREs) on another drive. Unless you periodically do a patrol read, these can go undetected until the controller tries to rebuild the data from all the blocks on the array; when the controller can't reconstruct the data properly because of a URE, you're SOL.
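On Linux software RAID, the equivalent of a patrol read is an md "check" scrub, which reads every block so latent UREs surface (and get rewritten from parity) while redundancy still exists. A minimal sketch, assuming the array is /dev/md0:

```python
# Kick off an md "check" scrub -- the software-RAID equivalent of a
# hardware controller's patrol read. Needs root; array path is assumed.
with open("/sys/block/md0/md/sync_action", "w") as f:
    f.write("check")

# Progress can be watched in /proc/mdstat or via:
#   cat /sys/block/md0/md/sync_completed
```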
 
Soldato
Joined
26 Sep 2007
Posts
4,137
Location
Newcastle
What's the point in doing a RAID setup if you can't guarantee you can recover from a faulty disk?

Is RAID 5 not reliable? If not, why does it even exist?

You can, as long as you're monitoring the array on a regular basis and checking everything is healthy; most controllers will do this automatically and then flag any alerts in HP ACU.
 
Soldato
Joined
29 Dec 2002
Posts
7,234
What's the point in doing a RAID setup if you can't guarantee you can recover from a faulty disk?

Is RAID 5 not reliable? If not, why does it even exist?

I've only ever done RAID 1 setups, as all I wanted was redundancy.

Firstly, RAID is not a backup in itself. You may keep a backup on a RAID array, but never mistake redundancy for a backup. R5 fell out of favour when it became increasingly obvious that a perfect rebuild was becoming mathematically improbable as drive sizes scaled up: between unrecoverable read errors and the risk of additional drives failing during the rebuild (they tend to be at similar points in their life cycle, and a rebuild puts extra load on them), it's no longer acceptable to run R5 in a lot of places/scenarios.
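To put a rough number on "mathematically improbable": assuming the commonly quoted consumer spec of one unrecoverable read error per 1e14 bits read (enterprise SAS drives are usually rated 1e15, so ten times better), a 4x4TB R5 rebuild works out roughly like this:

```python
# Odds of hitting at least one URE while rebuilding a 4-drive R5 array of
# 4TB disks: the rebuild must read every bit of the three survivors.
# The 1e-14 per-bit error rate is the consumer spec-sheet assumption.
bits_read = 3 * 4e12 * 8                      # three surviving 4TB drives
p_fail = 1 - (1 - 1e-14) ** bits_read
print(f"P(>=1 URE during rebuild) ~ {p_fail:.0%}")   # roughly 60%
```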

R0 used to be a thing for speed; SSDs fixed that. Then it was a thing for capacity while large SSDs were expensive (ignoring the issues with TRIM, people still do/did it); now NVMe has largely fixed that too, and capacity has reached a point where it's reasonably priced. R1 still has a place in some scenarios, but realistically we're approaching a point where fast connections make cloud-based storage a viable option for an offsite backup, and increasingly viable for 'disposable' data, e.g. media.

Anyway, @Amraam how did you get on?
 
Associate
Joined
19 Jul 2011
Posts
2,343
What's the point in doing a RAID setup if you can't guarantee you can recover from a faulty disk?

Is RAID 5 not reliable? If not, why does it even exist?

I've only ever done RAID 1 setups, as all I wanted was redundancy.

Don't forget RAID in its various flavours was invented decades ago.
 
Soldato
OP
Joined
5 Jan 2009
Posts
4,759
Firstly, RAID is not a backup in itself. You may keep a backup on a RAID array, but never mistake redundancy for a backup. R5 fell out of favour when it became increasingly obvious that a perfect rebuild was becoming mathematically improbable as drive sizes scaled up: between unrecoverable read errors and the risk of additional drives failing during the rebuild (they tend to be at similar points in their life cycle, and a rebuild puts extra load on them), it's no longer acceptable to run R5 in a lot of places/scenarios.

R0 used to be a thing for speed; SSDs fixed that. Then it was a thing for capacity while large SSDs were expensive (ignoring the issues with TRIM, people still do/did it); now NVMe has largely fixed that too, and capacity has reached a point where it's reasonably priced. R1 still has a place in some scenarios, but realistically we're approaching a point where fast connections make cloud-based storage a viable option for an offsite backup, and increasingly viable for 'disposable' data, e.g. media.

Anyway, @Amraam how did you get on?
I gave up for a while, leaving the whole thing powered off. Currently have the disks out whilst I boot up the controller to set them to pass through... if I can work out how to on this P410 controller, that is.
 
Associate
Joined
13 Oct 2009
Posts
238
Location
Cumbria
The reason it failed is that you probably had uncorrectable read errors (UREs) on another drive. Unless you periodically do a patrol read, these can go undetected until the controller tries to rebuild the data from all the blocks on the array; when the controller can't reconstruct the data properly because of a URE, you're SOL.

I've had this before; it was called a RAID puncture, and it was diagnosed from the RAID controller logs. In my case I think only one disk completely failed and another was flagged as "predictive failure". I was lucky that the puncture seemed to be in free space and the filesystem was healthy enough to clone to a single SSD, delete the punctured array, re-create it with new disks and clone the data back. Replacing faulty disks in the existing array wouldn't have fixed it.

Doesn't sound very promising if you're seeing a corrupt filesystem, though. Maybe you could clone the whole drive to an image and run file recovery tools on it? 8TB is going to be a challenge.
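If you go the imaging route, GNU ddrescue is the usual tool, since it logs progress to a map file and skips ahead on bad sectors instead of aborting. A minimal sketch, with placeholder device and output paths:

```python
# Image the failing array/device, then point recovery tools at the copy.
import subprocess

SOURCE = "/dev/sdb"                # placeholder: device to rescue
IMAGE = "/mnt/big/array.img"       # destination needs >= source size
MAPFILE = "/mnt/big/array.map"     # progress log; keep it with the image

# First pass: -n grabs the easy data quickly, skipping the slow scraping
# of bad areas; a second run without -n retries the damaged regions.
subprocess.run(["ddrescue", "-n", SOURCE, IMAGE, MAPFILE], check=True)
```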
 
Soldato
OP
Joined
5 Jan 2009
Posts
4,759
I've had this before; it was called a RAID puncture, and it was diagnosed from the RAID controller logs. In my case I think only one disk completely failed and another was flagged as "predictive failure". I was lucky that the puncture seemed to be in free space and the filesystem was healthy enough to clone to a single SSD, delete the punctured array, re-create it with new disks and clone the data back. Replacing faulty disks in the existing array wouldn't have fixed it.

Doesn't sound very promising if you're seeing a corrupt filesystem, though. Maybe you could clone the whole drive to an image and run file recovery tools on it? 8TB is going to be a challenge.

That's what I'm trying to do now. Doing it at work where I have oodles of TBs. Not holding my breath... :(
 
Soldato
OP
Joined
5 Jan 2009
Posts
4,759
That's what I'm trying to do now. Doing it at work where I have oodles of TBs. Not holding my breath... :(
Yikes, it's a slow process using this Celeron CPU. It's been running all day capturing a compressed image so I can fit it on a server here at work. It's on ~5% :/ Reckons it'll be another 9 days until completion...
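The compressor is usually the bottleneck on a weak CPU. A multithreaded one such as zstd with -T0 uses every core, which a single-threaded compressor can't; a minimal one-pass sketch, with placeholder paths:

```python
# One-pass compressed capture of a block device. zstd -T0 compresses on
# all cores, often the difference between hours and days on a weak CPU.
# Source device and output path are placeholders.
import subprocess

with open("/dev/sdb", "rb") as src, open("/mnt/big/array.img.zst", "wb") as out:
    subprocess.run(["zstd", "-T0", "-3"], stdin=src, stdout=out, check=True)
```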
 
Soldato
OP
Joined
5 Jan 2009
Posts
4,759
Four days later, the compressed image has been captured. 7.28TB array, of which about 3.5TB used. Compressed to a 2.05TB image file. R-Studio is now scanning this image. Fingers crossed...
 
Soldato
OP
Joined
5 Jan 2009
Posts
4,759
Image mounted and scanned. Loads of files, including large MP4s etc., marked green and healthy. Recover them and they won't play or open. What a disaster...

Even 5MB JPEGs are FUBAR'd.

Many, many files are reported in R-Studio as green/healthy and recoverable. Even small 10KB .ini files are wrecked, though; open them in Notepad++ etc. and they're corrupt. I'm looking at 100% data loss here. Man, I'm an idiot for not having a backup...
 
Soldato
OP
Joined
5 Jan 2009
Posts
4,759
There is at least a small ray of light. I installed this server back in 2016, and some of the data I've lost is still (hopefully) on the 1TB disk of my last desktop. Fingers crossed files can be recovered from there, but that disk is still three years old, meaning all my raw photos/GoPro footage from the last couple of holidays is gone forever.
 
Soldato
Joined
30 Jul 2005
Posts
19,423
Location
Midlands
About 20 hours before it said the 'parity initialisation' reached 100% and ended on 'FAILED'.

With such a long rebuild time, does RAID 5 even make sense on 4TB drives? It was handy with 36GB SAS drives back in the day, but these days I dunno; RAID 10 seems more appealing, or just a RAID 1 mirror?
 
Soldato
Joined
29 Dec 2002
Posts
7,234
RAID 5 stopped making sense long before 4TB drives. The example I remember reading many years ago was based on 3TB drives, where it was mathematically highly improbable to get a perfect rebuild at that sort of drive capacity. This was around the time pooling was becoming a thing; WHS/UnRAID etc. made more sense for media storage, which is what the majority of home users were looking for at the time.
 
Soldato
Joined
30 Jul 2005
Posts
19,423
Location
Midlands
How long would a RAID 5 rebuild take if the drives are 300GB each?
Since RAID is going away these days, it makes sense that the market is flooded with second-hand RAID controllers at dirt-cheap prices.
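As a rough guide, a rebuild has to write every sector of the replacement drive, so the time scales with capacity divided by sustained transfer rate, plus controller overhead and any competing I/O. A quick back-of-envelope, with ballpark assumed rates:

```python
# Rebuild time ~= capacity / sustained rate, ignoring overhead and load.
# The MB/s figures are ballpark assumptions for drives of each era.
def rebuild_hours(capacity_gb: float, mb_per_s: float) -> float:
    return capacity_gb * 1000 / mb_per_s / 3600

print(f"300GB @ 100 MB/s: {rebuild_hours(300, 100):.1f} h")   # ~0.8 h
print(f"4TB   @ 150 MB/s: {rebuild_hours(4000, 150):.1f} h")  # ~7.4 h idle; the ~20 h above shows real-world overhead
```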
 
Soldato
Joined
29 Dec 2002
Posts
7,234
You see the irony in asking about rebuild times for old mechanical drives in a thread started because of a failure of a mechanical R5 array using old drives to rebuild?

The three main reasons for R5 are redundancy, space and speed. Given 300GB drives aren't really ticking the last two boxes, you'd probably be better off looking at something flash-based and, if redundancy is a priority, running R1. That said, the failure rate on flash should be significantly lower than on multiple mechanical drives anyway.
 