Cheapest way to back up ~50TB?

We're soon to be putting in a SAN at work to hold about 50TB of data. My manager seems to think that we will be fine without backup, and that the budget can't afford it. I realise that the budget on this project is tight, but the idea of running without any form of backup gives me nightmares. So what is the cheapest way to back up 50TB of data? Ideally I also wouldn't want to keep having to swap tapes over etc, but if tape backup would work out a lot cheaper then so be it. Any suggestions?
 
What sort of data is this??

At the moment the software that is used at my present place of work compresses nicely (by around 600%), so backing it up doesn't require as much space, which is a bonus... Does everything need backing up, or are there certain file types and collections that don't really need backing up, saving on the space required for the raw data?


Andy
 
I find it hard to believe a backup solution hasn't been included in the project budget! Although today's SANs have very high fault tolerance (multiple shelves, RAID6), they just complement a backup solution and don't replace it.

What would you do if you lost the SAN due to flood or fire? Or what if a user decides to wipe an entire volume? I am aware that a SAN solution has internal restore methods but I still sleep better at night knowing data is stored offsite, preferably in a fireproof safe.

I would recommend you purchase an LTO-5 Autoloader (48 slot). Each tape holds 1.5TB (native; the 3TB compressed figure won't apply to data that is already compressed). This would give you 72TB of data backup storage.
I couldn't find a UK price for a HP/Dell autoloader but if we convert US $ prices, you're talking £10-15K.
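
To sanity-check the tape numbers, here is a rough sketch of the arithmetic in Python. It takes the 1.5TB per tape and 48 slots quoted above at face value; the function names are made up for illustration, and it ignores cleaning tapes and any rotation of tape sets.

```python
import math

TAPE_TB = 1.5   # capacity per LTO-5 tape, as quoted above
SLOTS = 48      # slots in the autoloader, as quoted above

def library_capacity_tb(slots=SLOTS, tape_tb=TAPE_TB):
    """Total capacity with every slot holding one data tape."""
    return slots * tape_tb

def tapes_for(data_tb, tape_tb=TAPE_TB):
    """Whole tapes needed for one full copy of data_tb terabytes."""
    return math.ceil(data_tb / tape_tb)

print(library_capacity_tb())  # 72.0
print(tapes_for(50))          # 34
```

So one full copy of a completely full 50TB SAN would fit in the library with slots to spare, but a second full set for rotation would not.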

It might seem a lot, but it's nothing in comparison to losing all company data.

/edit The amount you need to back up does seem a lot. Is the 50TB the size of the SAN, or the amount of data you need to store?
 
+1

It really baffles me why IT management can't see this simple equation.
I see it all the time and I just work in IT support.

Money not spent now will cost much more if the company has a disaster.
 
+1

It really baffles me why IT management can't see this simple equation.
I see it all the time and I just work in IT support.

Money not spent now will cost much more if the company has a disaster.

+1

In the same field and see this on a weekly basis, and then it's too late....
 
What kind of Mickey Mouse company is this? If you can afford to buy a fibre channel SAN and load it with 50TB, you can afford to take proper backups. As redundant as SANs are, they're not completely infallible. Also, what would happen in case of fire? If this is even slightly important data, which judging by the expense it will be, then you need to be taking offsite backups.

The most cost effective solution will depend on what kind of data this is and how many servers it will be presented to etc, but generally you're going to be looking at enterprise level backup software (CommVault or suchlike) with an LTO 4 or 5 library. Sure, this will end up being a 5 figure layout, but anything less and you're just asking for trouble in event of a disaster.

I would strongly recommend you sell this to management in terms of your BCP (you do have one, right?) and put quite bluntly that in case of fire, without the backup solution, you're completely dead in the water.
 
The trouble is that if you are putting in a ~50TB capacity SAN then how to back that up should have been factored into the design from the beginning. It's not just a case of throwing in a LTO# auto-loader, media (not cheap when you consider how many tapes you would need for a reasonable cycle) and backup software licenses.

You also need to consider how the service using the storage is going to use it and what limitations that is going to place on the backup.

What sort of time window will you have to back up the data?

- however good your tape drives are they will take time to back things up, (I have inherited systems before where the "nightly" backup ran for >24 hours!)

What type of data is it (e.g. is it flat file or Oracle database files) ?

- if it's Oracle data for instance you may need to backup using RMAN to allow for the service not to be taken down for the period of the backup ... this can add complexity (and normally licensing costs)

What is the rate of change of the data?

- Can you get away with a periodic full backup and then much smaller regular incremental backups ... but how often you do the full backup will depend on the answers to the following questions

What is the needed recovery time if a single file is lost?

- Which backup (full or incremental) holds the latest copy of the file and how will you know this information?

What is the needed recovery time if the whole lot is lost?

- Recovery of the full backup and then rolling through incrementals will take time.

How will you test that the backups are working, (initially and ongoing) ?

- As others have said this is very important ... if you cannot prove that your backups are recoverable then you have no backups.
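
One way to "prove" a backup is to restore it somewhere and compare it against the live data. The sketch below is only an illustration of that idea, not any backup product's own verify feature: `verify_restore` and `sha256_of` are hypothetical helper names, and it assumes the source has not changed since the backup was taken (otherwise changed files will show up as mismatches).

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large video files don't fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_restore(source_dir, restored_dir):
    """Return relative paths that are missing or differ in the restored copy."""
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file() or sha256_of(src) != sha256_of(restored):
            problems.append(rel.as_posix())
    return problems
```

An empty list back from `verify_restore("/path/to/live", "/path/to/test-restore")` (both paths are placeholders) means every file in the sample restored byte-for-byte. Running it on a small sample of folders on a schedule is far cheaper than a full 50TB test restore.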

The most complicated backup solution I was involved in was for a multi-terabyte Lotus Domino cluster, which required that all Domino database files be recoverable to an arbitrary point in time, either individually or as part of a full recovery of the service. This had to be a consideration from the start, as the configuration needed to do this affected the hardware requirements. It turned out to be quite a complex solution and it certainly wasn't cheap, but it worked well and both types of recovery were proven, which made the Customer happy (until someone else took over the support and screwed up the backup software completely ... but that's another story).
 
Thanks for the replies everyone. :)

What sort of data is this??

Student work, mostly video, audio and image files.

I would recommend you purchase an LTO-5 Autoloader (48 slot). Each tape holds 1.5TB (native; the 3TB compressed figure won't apply to data that is already compressed). This would give you 72TB of data backup storage.
I couldn't find a UK price for a HP/Dell autoloader but if we convert US $ prices, you're talking £10-15K.

It might seem a lot, but it's nothing in comparison to losing all company data.

That isn't actually as much as I had imagined. This may be an option I need to look into further. Thanks for the suggestion.

/edit The amount you need to back up does seem a lot. Is the 50TB the size of the SAN, or the amount of data you need to store?

It is the size of the SAN (which has yet to be implemented, btw), but I expect it to fill up very quickly, and then for a significant amount of the data to be flushed each year to make way for new data. So it should be running at about 70-80% capacity most of the time.

50TB of data? Important data? How much is to lose (in money) if this were to disappear over night? :eek:

Money-wise it is unlikely to cost much, if anything at all. It would however cause major problems, as it is going to be used for HD video footage for students. So if we lost it all, a whole load of technical extenuating circumstances forms would be filled in and a lot of people's heads would be on the block. It would also not do good things for my heart (I am user support, but will also be the admin for the edit facility, as central IT want me off their network for political reasons and stupid policies).

What kind of Mickey Mouse company is this? If you can afford to buy a fibre channel SAN and load it with 50TB, you can afford to take proper backups. As redundant as SANs are, they're not completely infallible. Also, what would happen in case of fire? If this is even slightly important data, which judging by the expense it will be, then you need to be taking offsite backups.

The most cost effective solution will depend on what kind of data this is and how many servers it will be presented to etc, but generally you're going to be looking at enterprise level backup software (CommVault or suchlike) with an LTO 4 or 5 library. Sure, this will end up being a 5 figure layout, but anything less and you're just asking for trouble in event of a disaster.

I would strongly recommend you sell this to management in terms of your BCP (you do have one, right?) and put quite bluntly that in case of fire, without the backup solution, you're completely dead in the water.

It is for a large university. As mentioned above, central IT want me off the network for political and policy reasons. The facility will consist of 38 Mac Pros for HD video editing/ Image editing etc. The whole project has lacked any sort of project management from the start, and I have taken it upon myself (probably stupidly) to get it sorted as I will be end user support and admin (Yet am not in a position to make major purchasing decisions etc.. sigh).

BCP? I wish! Well, actually there probably is one somewhere that won't have been looked at or updated for years. Unfortunately this is well above me, although I am more than willing to push the issue, especially with regards to the backup, as this would have a direct impact on me if it went wrong.

The trouble is that if you are putting in a ~50TB capacity SAN then how to back that up should have been factored into the design from the beginning. It's not just a case of throwing in a LTO# auto-loader, media (not cheap when you consider how many tapes you would need for a reasonable cycle) and backup software licenses.

You also need to consider how the service using the storage is going to use it and what limitations that is going to place on the backup.

Fortunately the SAN hasn't been implemented yet. The tender document is currently being written/finalized. So the sooner I can push the issue of backup, the better as I may be able to convince the powers that be to include it in the tender document. At the moment they are trying to do it all on cost, so think they can shave off money by not having backup! You can all see this is wrong, I can see this is wrong, but I just need some options and ideas so I can present them to and convince management.


What sort of time window will you have to back up the data?

- however good your tape drives are they will take time to back things up, (I have inherited systems before where the "nightly" backup ran for >24 hours!)

The 'downtimes' will be about 18:00-07:00, weekends, half term & holidays.
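
As a rough check on whether that window is big enough, the arithmetic can be sketched like this. The 13 hours comes from the 18:00-07:00 window above; the 140 MB/s figure is an assumption (the native rate of a single LTO-5 drive, which real backups with lots of small files often won't sustain), and the function names are made up.

```python
WINDOW_HOURS = 13      # 18:00 to 07:00, from the post above
DRIVE_MB_PER_S = 140   # assumption: one LTO-5 drive at its native rate

def tb_per_window(hours=WINDOW_HOURS, mb_per_s=DRIVE_MB_PER_S):
    """Terabytes one drive can write in a single nightly window."""
    return hours * 3600 * mb_per_s / 1_000_000

def hours_for_full(data_tb, mb_per_s=DRIVE_MB_PER_S):
    """Hours for one drive to write a full backup of data_tb terabytes."""
    return data_tb * 1_000_000 / mb_per_s / 3600

print(round(tb_per_window(), 1))     # 6.6
print(round(hours_for_full(50), 1))  # 99.2
```

On those assumptions a single drive manages roughly 6.6TB a night, so a full backup of 50TB would need several nights, a long holiday run, or more than one drive, while nightly incrementals should fit comfortably unless more than about 6TB changes in a day.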

What type of data is it (e.g. is it flat file or Oracle database files) ?

- if it's Oracle data for instance you may need to backup using RMAN to allow for the service not to be taken down for the period of the backup ... this can add complexity (and normally licensing costs)

Mainly video files (ProRes Quicktime), Audio Files and Image Files, with a few misc files in there as well.

What is the rate of change of the data?

Changes will occur at least every hour, every day of the working week. It will however go in cycles of heavy use (A lesson going on etc), right down to really minimal use.

- Can you get away with a periodic full backup and then much smaller regular incremental backups ... but how often you do the full backup will depend on the answers to the following questions

Incremental backup would be fine, and how I initially imagined it.

What is the needed recovery time if a single file is lost?

There shouldn't really be a need to recover single files, unless they were really important ones. If that was the case then I would say <48 hours? The main need for backup would be in case of a total failure of the SAN.

- Which backup (full or incremental) holds the latest copy of the file and how will you know this information?

As above, this may not be necessary. Any suggestions to this would be welcome though.

What is the needed recovery time if the whole lot is lost?

- Recovery of the full backup and then rolling through incrementals will take time.

This would depend on the time of year. But a week should be acceptable.
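
A week sounds generous, but it still implies a minimum sustained restore speed. A small sketch of that calculation, using decimal units (1TB = 1,000,000MB) and a made-up function name:

```python
def required_mb_per_s(data_tb, days):
    """Sustained rate needed to restore data_tb terabytes within `days` days."""
    seconds = days * 24 * 3600
    return data_tb * 1_000_000 / seconds

print(round(required_mb_per_s(40, 7), 1))  # 66.1  (SAN at 80% full)
print(round(required_mb_per_s(50, 7), 1))  # 82.7  (SAN completely full)
```

That is the rate needed around the clock for seven days with no pauses for tape changes, failed jobs or rebuilding the SAN itself first, so whatever is bought would need a fair amount of headroom above it.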

How will you test that the backups are working, (initially and ongoing) ?

- As others have said this is very important ... if you cannot prove that your backups are recoverable then you have no backups.

This is something I am not sure about, so again suggestions would be greatly welcome.

The most complicated backup solution I was involved in was for a multi-terabyte Lotus Domino cluster, which required that all Domino database files be recoverable to an arbitrary point in time, either individually or as part of a full recovery of the service. This had to be a consideration from the start, as the configuration needed to do this affected the hardware requirements. It turned out to be quite a complex solution and it certainly wasn't cheap, but it worked well and both types of recovery were proven, which made the Customer happy (until someone else took over the support and screwed up the backup software completely ... but that's another story).
 
If it was me in that situation I'd forget backup of that type of data in that environment and put the burden on the students to keep an off site copy of their work, I'm sure they all have laptops and external drives so it probably exists already for most. Trying to backup 50TB of data which doesn't have a huge business value is never going to be a business case that's easy to sell - so make it clear to the appropriate people exactly what you're providing and the limitations then make it their problem.

Or if you really must back it up, my solution for that would look something like a basic high capacity SAN (Satabeast maybe) in a different part of the campus and sync the files over the network nightly or something. It's not anything fancy but it provides an easy, quick to restore option at a low-ish cost. Won't survive a nuclear bomb but then again, there will probably be other things to worry about in that case...
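
For a feel of what the nightly sync involves, here is a minimal sketch in Python. In practice an existing tool such as rsync would do this job; `sync_changed` is a made-up name, and this version only copies new or changed files (judged by size and modification time) and never deletes anything from the backup side.

```python
import shutil
from pathlib import Path

def sync_changed(source_dir, backup_dir):
    """Copy files that are new or changed since the last run; return what was copied."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    copied = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        if dst.is_file():
            s, d = src.stat(), dst.stat()
            # Treat as unchanged if the size matches and the backup is at least as new.
            if s.st_size == d.st_size and int(s.st_mtime) <= int(d.st_mtime):
                continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also copies the modification time
        copied.append(rel.as_posix())
    return copied
```

Because only changed files travel over the link, the nightly run stays small after the first full copy. The catch is that a file deleted or corrupted on the main SAN is only protected until the next sync overwrites it, unless the backup side keeps older versions.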
 
How many users is this for?

Do you enforce disk quotas, if so how much?

Up to a potential 1000 users. Some may use it heavily, some hardly at all. I will be enforcing quotas depending on what modules the various users take, but it should average out at about 30GB each.
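
For reference, those quota figures work out as follows. This is just the arithmetic on the numbers above, in decimal units (1TB = 1000GB), with a made-up function name:

```python
SAN_TB = 50  # size of the SAN, from earlier in the thread

def total_quota_tb(users, avg_gb):
    """Total space if every user filled an average-sized quota."""
    return users * avg_gb / 1000

print(total_quota_tb(1000, 30))           # 30.0
print(total_quota_tb(1000, 30) / SAN_TB)  # 0.6
```

So the quotas alone account for about 30TB, or 60% of the SAN, which is the rough amount any backup would need to hold if everyone used their full allowance.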

If it was me in that situation I'd forget backup of that type of data in that environment and put the burden on the students to keep an off site copy of their work, I'm sure they all have laptops and external drives so it probably exists already for most. Trying to backup 50TB of data which doesn't have a huge business value is never going to be a business case that's easy to sell - so make it clear to the appropriate people exactly what you're providing and the limitations then make it their problem.

Or if you really must back it up, my solution for that would look something like a basic high capacity SAN (Satabeast maybe) in a different part of the campus and sync the files over the network nightly or something. It's not anything fancy but it provides an easy, quick to restore option at a low-ish cost. Won't survive a nuclear bomb but then again, there will probably be other things to worry about in that case...

Yeah, I think that is the view management are taking. My problem with that is I know how unreliable the students can be. And if we did have a major failure, it would still be me getting it in the neck and having to recover the situation as user support/admin. It could also severely damage our student survey results, which influence university rankings.

As for the cheap high capacity SAN, this is the solution that I initially imagined. Any idea what sort of price you are looking at for a Satabeast + 50TB of drives? Any other makes and models I should look at for cheap backup? One problem with this solution is that because I will be off the central network, we would have to lay our own cabling to the backup location, but this isn't too big a problem.
 
If it was me in that situation I'd forget backup of that type of data in that environment and put the burden on the students to keep an off site copy of their work, I'm sure they all have laptops and external drives so it probably exists already for most. Trying to backup 50TB of data which doesn't have a huge business value is never going to be a business case that's easy to sell - so make it clear to the appropriate people exactly what you're providing and the limitations then make it their problem.

This.

That is how many institutions that I've seen operating media courses in audio/video editing operate their systems. Many of them don't even provide the users with any on-site storage beyond a drive in the actual machine they would be working on - which given the size video projects can grow to I think is perfectly fair.

1TB external drives are so dirt cheap these days even students can afford them, and if they don't value their work enough to make sensible backups of it then that is their own fault IMO.

Look at it this way, they would only **** it up once, then probably start doing things properly. Could also be a valuable life lesson for them. I don't think I've yet been in the situation of losing a client's data for projects I've worked on, but that is because I learnt the hard way a long time ago to back things up correctly!
 
Agree with bigred and wij in this situation.

To do this properly will cost, and as said not being 'business' data you'll find it hard getting the buy in to do it.

It's not great, but fits the situation, the cost to implement will far outweigh the benefit.

Yeah I think that is the view management are taking. My problem with that is I know how unreliable the students can be.

Well this will be a life lesson for them in being responsible for yourself ;)
 
I'm inclined to agree with the later comments. As nice as it would be to provide a solution for them, I don't think it would be cost effective.

Perhaps offer them guidance and support on backing up their work, as well as finding appropriate tools for doing so?
I know it sounds like you're holding their hands a lot, but some people just aren't tech savvy.
 
I know people talk about fire / flood damage but at the end of the day if either happened I'm sure there would be a lot more at stake than some student videos gone missing, such as the uni / school being gone...

I can't see how you'd ever get it backed up if the files are changing that often unless you replicate constantly...
 
In a university setting, for undergraduate media data ... I would say a weekly backup to disk in another building at most should be provided by yourselves.

But also make sure that they have some way of getting their data backed up themselves to USB device without jumping through hoops to do so and strongly encourage them to do this.

In fact I would be tempted to say to them that they are supposed to back up their data but you do also do this weekly backup, but if they want you to restore, from said backup, then (a) there is no guarantee their file(s) are on it (in case they have been doing something at the time of the backup so a file has been skipped as it's in use) and (b) there is a charge of £25 to do the restore.

Charging them some beer money might encourage them to do their own backups.
 