DS1813+ - Panic Stations

About 20 minutes ago 3 out of 6 drives came up as failed, and now they are marked as "Not Initialized".

The setup was in RAID6, so it allowed for 2 drive failures but not 3.

I am really worrying now, so what are my next actions?

:( :( :(
 
Personally I would be backing up all your valuable stuff ASAP if you haven't already!

I can't... the volume isn't present so I can't see/access any of the data.

I have raised a support ticket with Synology and mentioned it on their Facebook and Twitter pages, so am just waiting for a response now. I am hoping they can manually remount the 3 failed drives, as SMART checks on them look fine.
 
I find it remarkable that 3 drives have failed all at the same time, and I suspect the DS1813+ itself is the problem.

Wish you the best of luck. Let us know how you get on.
 
Yeah, the logs show that all 3 died at the same time which is highly unlikely. The drives are all WD Red and less than a year old, and they are designed for NAS and 24/7 operation.

The drives in question do show 1 bad sector each, which the system says it repaired, so I am guessing they have just dropped out of the volume and that Synology can manually put them back in.

I am somewhat tempted to purchase an identical NAS and set up a High Availability system, but that would set me back £1500 so am in two minds about that.
 
A little disappointed that Synology has not responded to my ticket yet. They advertise their products as suitable for business use, but with their current SLAs they are definitely not fit for that purpose.
 
This is why I don't RAID at home.

Multiple single drives backed up to multiple single drives.

Any 'critical' data I keep on my SkyDrive (and backed up).

Hope you get it sorted. Do you have a backup of your files?
 
Sounds unlikely to be actual drive failure unless there was something like an electrical spike.

Not that it's helpful for you, but I only use RAID in my QNAP for convenience, so I can be up and running with minimum fuss if one of the internal drives fails. I have realtime replication to a USB disk in the back (only a very small performance impact) and regularly take a snapshot via the USB copy port on the front to alternating drives, so I have a good level of redundancy.
 
The data isn't backed up, because we are talking about 5TB worth of stuff. My NAS was the backup :(

Synology came back today saying 3 of the drives are showing errors (no **** Sherlock), and that I should take them out and scan them from my PC. If the scans come up alright, then the NAS could be faulty. If the scans show errors, then I should clone the drives onto new ones and put the new ones back into the NAS.

I've done an Advanced RMA with Western Digital already in preparation for this.
 
Analyse the drives using a live Linux distro.

If they all check out OK, attempt to assemble the array using the live distro. Forget the Synology unit for the time being.
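As a rough sketch of what that could look like, the Python below only builds and prints the commands so you can read them before typing anything. It does not run them. The device names (/dev/sda to /dev/sdf), the idea that the data sits on partition 3 of each disk, and the /dev/md127 array name are all assumptions for illustration; check the real layout with the `mdadm --examine` output first. The array is assembled read-only so nothing gets written to the disks.

```python
import shlex


def recovery_commands(disks, data_partition=3, md_device="/dev/md127"):
    """Return the commands to review, in order. Nothing is executed here.

    disks, data_partition and md_device are hypothetical defaults; adjust
    them to whatever the live distro actually shows for your drives.
    """
    members = [f"{disk}{data_partition}" for disk in disks]
    commands = []

    # 1. SMART health summary and attribute table for each physical disk.
    for disk in disks:
        commands.append(["smartctl", "-H", "-A", disk])

    # 2. Read the RAID superblock on each member partition (read-only query).
    for member in members:
        commands.append(["mdadm", "--examine", member])

    # 3. Assemble the array read-only from the listed members.
    commands.append(["mdadm", "--assemble", "--readonly", md_device, *members])

    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    six_disks = [f"/dev/sd{letter}" for letter in "abcdef"]
    for line in recovery_commands(six_disks):
        print(line)
```

If the `--examine` output shows the members disagreeing with each other, stop there and ask before forcing anything.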
 
So I just got back home (where the NAS is), powered it off and back on again. A message came up saying the RAID has been recovered and asking if I wanted to run a scan.

Everything looks fine now. How scary!
 
You should invest in a couple of 3TB externals and back up your data there. A couple of Toshibas (which use the very fine Hitachi drives inside) will only cost you £150 delivered.

No way would I trust that RAID array now after something like that happening. It also means you can have a backup off-site in case your house gets burgled or burnt down :eek: :p
 
The data is constantly changing, so any backup would have to be onsite.

The NAS is doing its scan now and will take 10 hours or so.
 
Lucky. Get a proper backup sorted ASAP.

I had thought that using RAID 6 was a proper backup, but I guess I was wrong.

Should I buy another identical NAS and have them update each other in real time, or am I better off getting a few large external drives?

/Edit: The scan finished earlier than expected and everything is alright. The NAS shows no bad sectors etc, so not sure what it was reporting before.
 
If it's critical/commercially important and money (within reason) isn't an object, get another NAS and have them sync in real time (you can probably use the 2nd ethernet port).

Otherwise, having some external drives doing realtime replication means you have a contingency copy of the data in an easy-to-read file system format, should you need it, without fuss.

RAID "is" a proper backup solution, but my experience over many years is that the whole RAID is just as likely to fail as a single disc (neither being that common as such), and that a separate backup solution is better if it's really critical data.
 
RAID is not a backup solution at all in itself; it is resiliency, which is not the same thing. Backup allows you to recover an independent second(+) copy of your data, separate from the main copy. Resiliency allows you to maintain access to your data despite a defined number of failures.

In a commercial environment you would likely have your storage using RAID to allow for resilient access to the data, but then you should have some sort of backup on top of that, which could involve automatic snapshotting or syncing of the data onto another media device, or manual backups. Consideration would need to be given to the recovery point objective (RPO), i.e. how much data change since the last backup can be lost in a recovery, and the recovery time objective (RTO), i.e. how long the recovery is allowed to take. For example, you could have an RTO of 4 hours, so you have to be able to recover the system in that time, with an RPO of 15 minutes, so the recovery needs to be to a data point at most 15 minutes before the system went down, which means you need to be sending updates to the backup copy within this window.
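To put rough numbers on that, here is a minimal Python sketch of the two checks. The function names are made up for illustration. The 5TB figure is the data size mentioned earlier in the thread, and the 100 MB/s restore speed is purely an assumption, so swap in whatever your own kit manages.

```python
from datetime import timedelta


def meets_rpo(sync_interval, rpo):
    """Worst-case data loss is the gap between two syncs, so it must fit the RPO."""
    return sync_interval <= rpo


def restore_time(data_bytes, throughput_bytes_per_s):
    """Estimated time to copy the whole data set back at a steady rate."""
    return timedelta(seconds=data_bytes / throughput_bytes_per_s)


def meets_rto(estimated_restore, rto):
    """The restore has to finish within the RTO."""
    return estimated_restore <= rto


if __name__ == "__main__":
    rpo = timedelta(minutes=15)
    rto = timedelta(hours=4)

    # Syncing every 10 minutes fits a 15 minute RPO; every 30 minutes does not.
    print(meets_rpo(timedelta(minutes=10), rpo))
    print(meets_rpo(timedelta(minutes=30), rpo))

    # 5 TB at an assumed 100 MB/s (decimal units).
    estimate = restore_time(5 * 10**12, 100 * 10**6)
    print(estimate, meets_rto(estimate, rto))
```

At that assumed speed a full 5TB restore takes the best part of 14 hours, so a 4 hour RTO would need either a faster link or a standby copy that can be brought online directly.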

Automatically syncing to another device can be good, but you should be aware of the limitation that if data is corrupted or deleted, those changes are likely to be replicated to the second copy as well.
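A toy Python illustration of that limitation, with dictionaries standing in for file systems. The class names are hypothetical and this is not how any particular NAS implements it; it just contrasts a plain mirror with a backup that keeps a few older snapshots.

```python
class MirrorBackup:
    """Keeps one copy that always matches the source after a sync."""

    def __init__(self):
        self.copy = {}

    def sync(self, source):
        # Deletions and bad edits in the source are copied across too.
        self.copy = dict(source)


class SnapshotBackup:
    """Keeps the last few point-in-time copies instead of just one."""

    def __init__(self, keep=3):
        self.keep = keep
        self.snapshots = []

    def sync(self, source):
        self.snapshots.append(dict(source))
        self.snapshots = self.snapshots[-self.keep:]

    def restore(self, filename):
        # Walk back from the newest snapshot until the file is found.
        for snapshot in reversed(self.snapshots):
            if filename in snapshot:
                return snapshot[filename]
        return None


if __name__ == "__main__":
    source = {"accounts.xls": "good data"}
    mirror, snaps = MirrorBackup(), SnapshotBackup()

    mirror.sync(source)
    snaps.sync(source)

    del source["accounts.xls"]  # accidental deletion on the main copy
    mirror.sync(source)
    snaps.sync(source)

    print("accounts.xls" in mirror.copy)   # the mirror has lost it as well
    print(snaps.restore("accounts.xls"))   # an older snapshot still has it
```

The snapshots only help for as long as an older copy is still retained, so the number kept needs to cover how long a problem might go unnoticed.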
 