Cheapest way to back up ~50TB?

We're soon to be putting in a SAN at work to hold about 50TB of data. My manager seems to think that we will be fine without backup, and that the budget can't afford it. I realise that the budget on this project is tight, but the idea of running without any form of backup gives me nightmares. So what is the cheapest way to back up 50TB of data? Ideally I wouldn't want to keep having to swap tapes over etc., but if tape backup would work out a lot cheaper then so be it. Any suggestions?
 
Thanks for the replies everyone. :)

What sort of data is this??

Student work, mostly video, audio and image files.

I would recommend you purchase an LTO-5 autoloader (48-slot). Each tape holds 1.5TB native (up to 3TB compressed). That would give you 72TB of native backup capacity.
I couldn't find a UK price for an HP/Dell autoloader, but converting US dollar prices you're talking £10-15K.

It might seem a lot, but it's nothing in comparison to losing all company data.
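
As a rough sanity check on those numbers (my own assumptions: 1.5TB native per LTO-5 tape, and since the data is already-compressed video/audio I wouldn't count on the 2:1 compressed figure):

```python
# Back-of-envelope LTO-5 sizing (assumptions: 1.5 TB native per tape,
# 48-slot library, 50 TB of data to protect).
import math

TAPE_NATIVE_TB = 1.5   # LTO-5 native capacity; compression won't help much with video
SLOTS = 48
DATA_TB = 50

print(f"Library native capacity: {TAPE_NATIVE_TB * SLOTS:.0f} TB")           # 72 TB
print(f"Tapes used by one full pass: {math.ceil(DATA_TB / TAPE_NATIVE_TB)}")  # ~34 tapes
```

So one full pass eats roughly 34 tapes, which is worth remembering when costing media for a sensible rotation.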

That isn't actually as much as I had imagined. This may be an option I need to look into further. Thanks for the suggestion.

/edit The amount you need to back up does seem a lot. Is the 50TB the size of the SAN, or the amount of data you need to store?

It is the size of the SAN (which has yet to be implemented, btw), but I expect it to fill up very quickly, then a significant amount of the data to be flushed each year to make way for new data. So it should be running at about 70-80% capacity most of the time.

50TB of data? Important data? How much is there to lose (in money) if this were to disappear overnight? :eek:

Money-wise it is unlikely to cost much, if anything at all. It would however cause major problems, as it is going to be used for HD video footage for students. So if we lost it all, a whole load of technical extenuating-circumstances forms would be filled in and a lot of people's heads would be on the block. It would also not do good things for my heart (I am user support, but will also be the admin for the edit facility, as central IT want me off their network for political reasons and stupid policies).

What kind of Mickey Mouse company is this? If you can afford to buy a fibre channel SAN and load it with 50TB, you can afford to take proper backups. As redundant as SANs are, they're not completely infallible. Also, what would happen in case of fire? If this is even slightly important data, which judging by the expense it will be, then you need to be taking offsite backups.

The most cost-effective solution will depend on what kind of data this is and how many servers it will be presented to etc., but generally you're going to be looking at enterprise-level backup software (CommVault or suchlike) with an LTO-4 or 5 library. Sure, this will end up being a five-figure outlay, but anything less and you're just asking for trouble in the event of a disaster.

I would strongly recommend you sell this to management in terms of your BCP (you do have one, right?) and put it quite bluntly that in case of fire, without the backup solution, you're completely dead in the water.

It is for a large university. As mentioned above, central IT want me off the network for political and policy reasons. The facility will consist of 38 Mac Pros for HD video editing, image editing etc. The whole project has lacked any sort of project management from the start, and I have taken it upon myself (probably stupidly) to get it sorted, as I will be end user support and admin (yet I am not in a position to make major purchasing decisions... sigh).

BCP? I wish! Well, actually there probably is one somewhere that won't have been looked at or updated for years. Unfortunately this is well above me, although I am more than willing to push the issue, especially with regards to the backup, as this would have a direct impact on me if it went wrong.

The trouble is that if you are putting in a ~50TB capacity SAN, then how to back it up should have been factored into the design from the beginning. It's not just a case of throwing in an LTO autoloader, media (not cheap when you consider how many tapes you would need for a reasonable cycle) and backup software licenses.

You also need to consider how the service using the storage is going to use it, and what limitations that is going to place on the backup.

Fortunately the SAN hasn't been implemented yet. The tender document is currently being written/finalized, so the sooner I can push the issue of backup the better, as I may be able to convince the powers that be to include it in the tender document. At the moment they are trying to do it all on cost, so they think they can shave off money by not having backup! You can all see this is wrong, I can see this is wrong, but I just need some options and ideas so I can present them to management and convince them.


What sort of time window will you have to back up the data?

- however good your tape drives are they will take time to back things up, (I have inherited systems before where the "nightly" backup ran for >24 hours!)

The 'downtimes' will be about 18:00-07:00, weekends, half term & holidays.
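
It's worth seeing what that window actually buys you on tape. A rough calc, assuming a single LTO-5 drive streaming at around its ~140MB/s native rate (a generous assumption - it only holds if you can keep the drive fed):

```python
# How long a full 50 TB pass takes on a single tape drive
# (assumption: ~140 MB/s sustained LTO-5 native speed, drive kept streaming).
DATA_TB = 50
DRIVE_MB_PER_S = 140
NIGHTLY_WINDOW_HOURS = 13   # 18:00 - 07:00

hours_per_full = DATA_TB * 1_000_000 / DRIVE_MB_PER_S / 3600
print(f"Single-drive full backup: ~{hours_per_full:.0f} hours (~{hours_per_full / 24:.1f} days)")
print(f"Overnight windows needed: ~{hours_per_full / NIGHTLY_WINDOW_HOURS:.1f}")
```

So a full 50TB pass is roughly 100 hours on one drive - it won't fit in a single overnight window. Fulls would have to run over weekends/holidays (or across multiple drives), with incrementals overnight.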

What type of data is it (e.g. flat files or Oracle database files)?

- if it's Oracle data, for instance, you may need to back up using RMAN to allow the service not to be taken down for the duration of the backup ... this can add complexity (and normally licensing costs)

Mainly video files (ProRes Quicktime), Audio Files and Image Files, with a few misc files in there as well.

What is the rate of change of the data?

Changes will occur at least every hour, every day of the working week. It will however go in cycles of heavy use (A lesson going on etc), right down to really minimal use.

- Can you get away with a periodic full backup and then much smaller regular incremental backups ... but how often you do the full backup will depend on the answers to the following questions

Incremental backup would be fine, and how I initially imagined it.

What is the needed recovery time if a single file is lost?

There shouldn't really be a need to recover single files, unless they were really important ones. If that was the case then I would say <48 hours? The main need for backup would be in case of a total failure of the SAN.

- Which backup (full or incremental) holds the latest copy of the file and how will you know this information?

As above, this may not be necessary. Any suggestions to this would be welcome though.

What is the needed recovery time if the whole lot is lost?

- Recovery of the full backup and then rolling through incrementals will take time.

This would depend on the time of year. But a week should be acceptable.
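
A week looks doable on rough numbers, at least for the raw data movement (my assumptions: ~110MB/s sustained over a single gigabit link, ~140MB/s off a single LTO-5 drive; real restores also add tape loads/seeks and rolling forward through incrementals):

```python
# Rough full-restore timings for 50 TB under the stated assumptions.
DATA_MB = 50 * 1_000_000

for label, rate_mb_s in [("gigabit link from a second SAN", 110),
                         ("single LTO-5 drive", 140)]:
    days = DATA_MB / rate_mb_s / 3600 / 24
    print(f"{label}: ~{days:.1f} days")
```

That comes out at roughly 4-5.5 days of straight copying either way, so a one-week recovery target is tight but realistic.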

How will you test that the backups are working, (initially and ongoing) ?

- As others have said this is very important ... if you cannot prove that your backups are recoverable then you have no backups.

This is something I am not sure about, so again suggestions would be greatly welcome.
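
One low-effort approach, offered purely as a sketch (the paths and sample size below are invented, not anything from this setup): on a schedule, restore a random handful of files from the backup to a scratch area and checksum them against the live copies (for files that haven't changed since the backup ran). It doesn't prove a full disaster recovery, but it catches silently broken backups early.

```python
# Hypothetical restore spot-check: compare checksums of a random sample of
# live files against their restored counterparts. Paths are placeholders.
import hashlib, pathlib, random

SOURCE_ROOT = pathlib.Path("/Volumes/EditSAN")       # live SAN mount (example)
RESTORE_ROOT = pathlib.Path("/Volumes/RestoreTest")  # where test restores land (example)
SAMPLE_SIZE = 20

def sha256(path, chunk=1024 * 1024):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

files = [p for p in SOURCE_ROOT.rglob("*") if p.is_file()]
for src in random.sample(files, min(SAMPLE_SIZE, len(files))):
    restored = RESTORE_ROOT / src.relative_to(SOURCE_ROOT)
    if not restored.exists():
        print(f"MISSING in restore: {src}")
    elif sha256(src) != sha256(restored):
        print(f"CHECKSUM MISMATCH: {src}")
    else:
        print(f"OK: {src}")
```

Once or twice a term you'd also want to rehearse a bigger restore (a whole student folder, say) to check the full-plus-incremental chain actually rolls forward.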

The most complicated backup solution I was involved in was a multi-terabyte Lotus Domino cluster, which required that all Domino database files could be recovered to an arbitrary point in time, either individually or as part of a full recovery of the service. This had to be a consideration from the start, as the requirements for doing this affected the hardware specification. It turned out to be quite a complex solution, and it certainly wasn't cheap, but it worked well and both types of recovery were proven, which made the customer happy (until someone else took over the support and screwed up the backup software completely... but that's another story).
 
How many users is this for?

Do you enforce disk quotas, if so how much?

Up to a potential 1000 users. Some may use it heavily, some hardly at all. I will be enforcing quotas depending on what modules the various users take, but it should average out at about 30GB each.
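
Quick sanity check on how those quotas stack up against the SAN (just the figures above, nothing clever):

```python
# Quota commitment vs raw capacity, using the numbers quoted in the thread.
USERS = 1000
AVG_QUOTA_GB = 30
SAN_TB = 50

committed_tb = USERS * AVG_QUOTA_GB / 1000
print(f"Committed if everyone filled their quota: {committed_tb:.0f} TB "
      f"({committed_tb / SAN_TB:.0%} of the SAN)")
```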

If it was me in that situation I'd forget backup of that type of data in that environment and put the burden on the students to keep an off-site copy of their work; I'm sure they all have laptops and external drives, so it probably exists already for most. Trying to back up 50TB of data which doesn't have a huge business value is never going to be a business case that's easy to sell, so make it clear to the appropriate people exactly what you're providing and its limitations, then make it their problem.

Or if you really must back it up, my solution would look something like a basic high-capacity SAN (Satabeast maybe) in a different part of the campus, syncing the files over the network nightly or something. It's nothing fancy, but it provides an easy, quick-to-restore option at a low-ish cost. It won't survive a nuclear bomb, but then again there would probably be other things to worry about in that case...
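
If it helps, the nightly sync could be as simple as a dated rsync snapshot job along these lines (very much a sketch: the host name and paths are invented, and it assumes rsync over SSH between the two boxes; unchanged files get hard-linked into each night's snapshot, so keeping a week or so of snapshots costs little extra space):

```python
# Sketch of a nightly rsync snapshot from the live SAN to a replica box.
# Host/paths are examples only; assumes rsync + SSH access to the replica.
import datetime, subprocess

SOURCE = "/Volumes/EditSAN/"                 # live SAN mount (example)
DEST_HOST = "backupbox"                      # replica in another building (example)
SNAP_ROOT = "/Volumes/BackupSAN/snapshots"   # snapshot directory on the replica (example)

today = datetime.date.today().isoformat()
yesterday = (datetime.date.today() - datetime.timedelta(days=1)).isoformat()

subprocess.run([
    "rsync", "-a", "--delete",
    f"--link-dest={SNAP_ROOT}/{yesterday}",  # hard-link files unchanged since last night
    SOURCE,
    f"{DEST_HOST}:{SNAP_ROOT}/{today}",
], check=True)
```

Pruning old snapshot directories on the replica (and alerting when the job fails) would be the other half of the job.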

Yeah, I think that is the view management are taking. My problem with that is I know how unreliable the students can be, and if we did have a major failure it would still be me getting it in the neck and having to recover the situation as user support/admin. It could also severely damage our student survey results, which influence university rankings.

As for the cheap high capacity SAN, this is the solution that I initially imagined. Any idea what sort of price you are looking at for a Satabeast + 50TB of drives? Any other makes and models I should look at for cheap backup? One problem with this solution is that because I will be off the central network, we would have to lay our own cabling to the backup location, but this isn't too big a problem.
 
I'm inclined to agree with the later comments. As nice as it would be to provide a solution for them, I don't think it would be cost effective.

Perhaps offer them guidance and support on backing up their work, as well as finding appropriate tools for doing so?
I know it sounds like you're holding their hands a lot, but some people just aren't tech savvy.

The problem with this is it would cause more problems than it would solve. I would end up spending much of my time helping people back up work etc. (a lot of users may not be tech savvy at all, so I need to make it as foolproof as possible). Also, if we encouraged students to all get external hard drives, they are likely to start working off them. This is OK for audio and still imagery, but if they did it with video work then it is going to cause me a world of grief. A combination of different makes, models, connections and file systems would again lead to my time being taken up with fault finding. I don't mind students backing up for their own peace of mind, or to recover work which they themselves have messed up, but relying on them in the case of a major failure may not be the best thing.

I know people talk about fire / flood damage but at the end of the day if either happened I'm sure there would be a lot more at stake than some student videos gone missing, such as the uni / school being gone...

I can't see how you'd ever get it backed up if the files are changing that often unless you replicate constantly...

It's not just major fires or floods that would be the issue though. Water damage (and with the amount of luck I have had with leaks and water damage recently, this is one that worries me), electrical problems, viruses etc. may all lead to a major data loss without wiping out the whole school.

The backup wouldn't need to be kept 100% up to date, but being able to recover say 1 week back should be sufficient and would save a lot of work.

I really can't see the point of installing a 50TB SAN here. If you are saying that all the students have local storage to work on, and that you are thinking of making it their issue to secure their data in the event of a failure, what is the point of a centrally managed storage solution which won't be managed properly?

With that amount of data, either it gets managed properly with scheduled backup windows and proper management buy-in, or don't do it at all.

The students won't have local storage to work on. All of the students need to be able to access any of the computers and carry on their work. The options I see for this are: external hard drives, and as I mentioned above, if the students provide their own this will cause more problems than it is worth due to the mix of different makes, models, connections and file systems. We could provide them, but give all 1000 students a drive and you have just spent as much as, or more than, a decent SAN. Or implement a SAN, which will be managed in terms of user quotas, and hopefully backed up if I get my way.

Sorry to butt in... I'm new here.

Will the students be storing and working directly on the SAN, or will they be working on their local disks and then need to do the backup themselves?

If they will be working directly on the SAN, with the students reading/writing large HD files, are you sure that the SAN will cope? That is from a bandwidth perspective and also from the NAS perspective (that is, the OS/firmware on the NAS).

They will be working directly off the SAN. Yeah, the bandwidth and access times etc. have all been carefully calculated and the SAN will be more than able to cope. We are aiming to get enough bandwidth from the SAN to be able to saturate gigabit Ethernet for 26 machines at the same time. This is enough to provide 5 streams of ProRes 422 to each machine at the same time.
 
Going back a bit - for a 42TB SATA Beast - depending on your relationship and how much discount you can extract, maybe £23k?

That's not too bad.

Dammm that's expensive, we managed to get one for a bit more than half of that. :D

And that is even better! I certainly think that falls into the land of affordable considering the overall budget of the SAN install.

depends if it does the OP out of a job by removing the management infrastructure :p

Don't worry about my job, network admin & user support is just a small part of it. ;) Most of my time is spent being a photography/print/sound/theatre/video technician, and general everything-with-a-plug technician.

I don't think a university will be lacking bandwidth! They're normally on multiple gbit links already. I imagine the Janet network is 10gbit + by now?

For one, the facility will be off the central network, so will be accessing the internet via wireless. But more importantly, I don't think I would make myself very popular hogging the entire Janet or university network!

Just a question: is it one room where this information will be accessed, or is it everywhere?

I'm sorry if I'm missing the point here, but if students are accessing this sort of information (30GB each) across even gigabit links, there are going to be huge data link bottlenecks. If the files (movies/music) are a few GB each and a room of, let's say, 24 PCs is being used... there are huge problems with speed here.

I work in a college and we have issues with one room (music production), although the links aren't upgraded to gigabit. We are looking at new Apple Mac solutions and local storage, as network storage is just too much and bandwidth issues occur.

Put the emphasis onto the students: give them all a 250GB USB disk. Much faster than the network and cheaper!!!

I am shocked that you're talking about 50TB here and there's no backup solution in the plan. I wouldn't dare touch a system where there are no provisions in place for data loss. Nasty...

Yeah, the facility will consist of one room split into three areas (input, process and output) with a total of 38 computers. There won't be a bottleneck for our needs. We will be doing video editing off the SAN, transcoding footage into ProRes 422 (1080p 25fps), which runs at about 17MB/s. A gigabit connection will be more than enough for, say, 4 streams of footage (the Mac Pros will start to struggle before we saturate the connection). This just leaves the SAN, which we are speccing to support this sort of data rate across the 38 computers, which is the main reason it is going to be rather expensive.
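
Putting rough numbers on that, using the ~17MB/s per ProRes 422 stream figure and assuming around 110MB/s of usable throughput on a gigabit link:

```python
# Per-client and aggregate bandwidth estimate from the figures in the thread.
STREAM_MB_S = 17            # ProRes 422, 1080p25 (approx.)
GIGABIT_USABLE_MB_S = 110   # assumed usable throughput of one gigabit link
MACHINES = 38
STREAMS_PER_MACHINE = 4

per_machine = STREAM_MB_S * STREAMS_PER_MACHINE
print(f"Per machine: {per_machine} MB/s "
      f"({per_machine / GIGABIT_USABLE_MB_S:.0%} of a gigabit link)")
print(f"Worst case at the SAN with all {MACHINES} machines pulling "
      f"{STREAMS_PER_MACHINE} streams: ~{per_machine * MACHINES / 1000:.1f} GB/s")
```

So the client links have headroom, but the SAN itself has to be specced for a couple of GB/s aggregate, which is where the cost goes.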

Anyway, the Satabeast (or equivalent; any other suggestions are welcome) looks very promising and should be possible within the budget.
 
Equallogic 6500E http://www.equallogic.com/products/default.aspx?id=7905
If you don't mind SATA drives.

Thanks for the suggestion. Any idea of the price for a fully loaded one with 1TB drives?

What sort of policies are the workstations applying? Sorry, did you say they were Windows or Apple OS? Do the students log into the Macs? I find that a lot of the issues we get are login speeds; we're still on XP but in 6 months' time upgrading to GB + WIN7 VD.

There will be various policies depending on the user group, with some resetting preference files and doing cleanup duties. Nothing too intensive, and anything that does require a bit of time will be done overnight.

The server is running OS X Server, and all the clients are running OS X 10.6. Yeah, all the students log in, authenticating through Open Directory.
 