How to transfer ~400MB/sec between 2 computers?

dan_aka_jack · 6 Mar 2007 at 09:11

Hi there,

What are my options if I want to connect two computers with a pipe capable of about 400MBytes/sec? Are my only two options 10-gigabit Ethernet and 4Gb Fibre Channel? Is it possible to get PCI-Express 10-gigabit Ethernet NICs in the UK? If so, how much are they?! (I can only find PCI-X NICs which start at about £600).

(The reason I'm asking is because I'm researching the posibility of building a system for editing uncompressed high definition video and I'd like the storage to be in a dedicated Linux box connected to my WinXP workstation. The dedicated Linux box would run a software RAID-6 array and also perform regular backups).

Thanks,
Jack

growse · 6 Mar 2007 at 09:53

Sun have just made a 4x PCI-E 10Gbit card, the fibre version is out now and retails at about $999. Copper is later this year I think.

It's an interesting route to go down, you might want to look into SAN technology if you're inclined to go down that road.

You'd need a *lot* of discs and a good adapter to get 400MB/s out of RAID-6 though (imo).... RAID-10 might be a better solution as it doesn't suffer from poor writes. Possible though if you throw enough money at it

Will_3rd · 6 Mar 2007 at 09:55

Firewire?

Uhtred · 6 Mar 2007 at 10:04

raid 10 would be better for read/write speeds but would suffer with capacity.

Cuchulain · 6 Mar 2007 at 10:07

Will_3rd said:
Firewire?

That's 400 megabit, not megabyte, HUGE difference

Will_3rd · 6 Mar 2007 at 10:09

Cuchulain said:
That's 400 megabit, not megabyte, HUGE difference

Ah OK

49.152MBytes per second

sniper007 · 6 Mar 2007 at 10:16

Are you rich?

dan_aka_jack · 6 Mar 2007 at 10:34

Hi,

Thanks all for your very swift replies.

sniper007 said:
Are you rich?

No! Which is kind of why I'm considering this "Linux RAID box" plan. Here's my thinking...

I require a RAID array capable of 400MBytes/sec and I'd like to use RAID6 because RAID5 isn't resiliant enough and RAID 10 uses too many disks. To get enough speed, I probably need about 12 disks.

Why build a dedicated Linux box to run my RAID? Several reasons:

1) I've heard that if you dedicate a computer to the task of running a RAID array then you can easily get away with "software RAID". i.e. I don't need to spend hundreds of pounds on an expensive dedicated RAID controller. In fact, I've heard it said that a Linux software RAID sollution can actually be faster than a dedicated RAID controller because modern CPUs are considerably faster than the APICs found on most dedicated RAID controllers.

2) I've got enough spare parts kicking around to mean that I can build a decent Linux box without having to spend very much money.

3) 12 disks are quite a lot to house. I could stick a RAID card in my WinXP workstation and have those 12 disks in an external disk enclosure but that starts to get expensive. I'd prefer to have a large (maybe 19" rack-mount) computer case with a mobo running Linux and a bunch of disks.

4) With a dedicated Linux box, I can also use that box for other tasks like tape backup and I can also teach Linux to do clever things like e-mail me whenever a disk kicks out a SMART error.

Thanks,
Jack

growse · 6 Mar 2007 at 11:03

dan_aka_jack said:
Hi,

Thanks all for your very swift replies.

No! Which is kind of why I'm considering this "Linux RAID box" plan. Here's my thinking...

I require a RAID array capable of 400MBytes/sec and I'd like to use RAID6 because RAID5 isn't resiliant enough and RAID 10 uses too many disks. To get enough speed, I probably need about 12 disks.

Why build a dedicated Linux box to run my RAID? Several reasons:

1) I've heard that if you dedicate a computer to the task of running a RAID array then you can easily get away with "software RAID". i.e. I don't need to spend hundreds of pounds on an expensive dedicated RAID controller. In fact, I've heard it said that a Linux software RAID sollution can actually be faster than a dedicated RAID controller because modern CPUs are considerably faster than the APICs found on most dedicated RAID controllers.

2) I've got enough spare parts kicking around to mean that I can build a decent Linux box without having to spend very much money.

3) 12 disks are quite a lot to house. I could stick a RAID card in my WinXP workstation and have those 12 disks in an external disk enclosure but that starts to get expensive. I'd prefer to have a large (maybe 19" rack-mount) computer case with a mobo running Linux and a bunch of disks.

4) With a dedicated Linux box, I can also use that box for other tasks like tape backup and I can also teach Linux to do clever things like e-mail me whenever a disk kicks out a SMART error.

Thanks,
Jack

Seems reasonable, but the transfer costs are going to kill you. There's no way you can build a network that can do 400MB/s on the cheap. 2 10GbE cards is probably going to be £1000 straight off, before even looking at the cost of fibre. A SAN may be cheaper, froogle has a 1Ch fibre HBA for about £350, but most of the cheap ones are only 2Gbit. Even then, you'll have to figure out how to set up the linux box as a SAN host.

The final option is to buy lots of Gbit ethernet cards and bond them all together. You might be able to get 2 4-port Intel Gbit cards, run 4 bits of cable between the boxes and bond them together giving you 4Gbit, but having never played with bonding I don't know if it works like that.

Any way you cut it, it's pricey

dan_aka_jack · 6 Mar 2007 at 11:21

growse said:
The final option is to buy lots of Gbit ethernet cards and bond them all together. You might be able to get 2 4-port Intel Gbit cards, run 4 bits of cable between the boxes and bond them together giving you 4Gbit, but having never played with bonding I don't know if it works like that.

That sounds like an interesting idea - thanks. Are there any open source / free / cheap bonding drivers for WinXP (or, even better, can it cope with bonding out-of-the-box)?

Also, I had a look at ATA-over-Ethernet but I couldn't find any WinXP software for doing ATA-over-Ethernet. Do any WinXP drivers exist?

Oh, I forgot to mention another big reason for looking at SAN: one day I might have more than one workstation. For example, I might have three computers running in my office: 1 linux RAID box, 1 WinXP workstation and 1 MacOS workstation. I'd like to let both workstations see the files on the Linux RAID box. Hence the thinking that ATA-over-Ethernet may be the way to go for me.

But maybe things are getting too complicated and I should just go for a RAID controller card for my WinXP workstation and forget all this ATA-over-Ethernet stuff!!

Many thanks,
Jack

edit here's a wikipedia page on bonding: http://en.wikipedia.org/wiki/Link_aggregation

growse · 6 Mar 2007 at 15:44

dan_aka_jack said:
That sounds like an interesting idea - thanks. Are there any open source / free / cheap bonding drivers for WinXP (or, even better, can it cope with bonding out-of-the-box)?

Also, I had a look at ATA-over-Ethernet but I couldn't find any WinXP software for doing ATA-over-Ethernet. Do any WinXP drivers exist?

Oh, I forgot to mention another big reason for looking at SAN: one day I might have more than one workstation. For example, I might have three computers running in my office: 1 linux RAID box, 1 WinXP workstation and 1 MacOS workstation. I'd like to let both workstations see the files on the Linux RAID box. Hence the thinking that ATA-over-Ethernet may be the way to go for me.

But maybe things are getting too complicated and I should just go for a RAID controller card for my WinXP workstation and forget all this ATA-over-Ethernet stuff!!

Many thanks,
Jack

edit here's a wikipedia page on bonding: http://en.wikipedia.org/wiki/Link_aggregation

Ok, so cards that support bonding should have that option available in their drivers. I know the HP/Intel ones I have support it. However, whether that supports bandwidth sharing or just failover I don't know. Equally, I don't know if it can split one IP conversation over multiple links, or if it shares by having multiple connections going over different links. If you're transferring lots of data over a single connection and bonding only load balances connections, it's not going to work.

ATA-Over-Ethernet is actually SCSI over ethernet and is called iSCSI. This allows you to do block-level disk access over a network. The client software is free (from microsoft) and you usually have to pay for the iSCSI target software, although there may be something free for linux. However, if you don't need a SAN specifically, a NAS might be better as iSCSI has a few more overheads compared to just plain NAS (windows file sharing + samba).

Also, bear in mind that a SAN only allows one client to be connected to a volume at any one time. Otherwise you'd have to synchronize changes across all clients connected to the volume. This can be done with a SAN filesystem, but these usually are $$$.

To me, it sounds like a SAN won't work, unless you go fibre-channel, and won't work if you want more than one person to access the data. I'd investigate the ethernet bonding route and see if you can get a 4GBit link using a pair of 4-port Gigabit cards.

Uhtred · 6 Mar 2007 at 16:25

As far as i know, with the HP ones i've used, they share the same IP doubling the bandwidth and so it isn't just a simple fallover

Tui · 6 Mar 2007 at 22:59

Bonding won't do what is wanted. Multiple channels are used to load share based on source, destination or both (MAC or IP address) and this determines which channel is used. Traffic between the two same points will always go down the same channel so the maximum throughput will be the speed of that channel.

dan_aka_jack · 7 Mar 2007 at 09:17

Hi guys,

Thanks loads for your replies. I must admit that I'm getting confused!

growse said:
ATA-Over-Ethernet is actually SCSI over ethernet and is called iSCSI. This allows you to do block-level disk access over a network.

As I understand it (and I certainly might be wrong), there are two SAN technologies: SCSI-over-Ethernet (iSCSI) and ATA-over-Ethernet (AoE).
http://en.wikipedia.org/wiki/ATA_over_Ethernet

I presume it would be ineffecient to use an iSCSI connection if the disks are SATA disks because you'd want to use the same protocol end-to-end hence AoE would be better than iSCSI if you're using ATA disks.

growse said:
The (iSCSI) client software is free (from microsoft)

Ah, cool - that's very interesting, thanks. Does MS do a free AoE client too?

growse said:
However, if you don't need a SAN specifically, a NAS might be better as iSCSI has a few more overheads compared to just plain NAS (windows file sharing + samba).

Again, I'm probably wrong but I thought the whole point of iSCSI or AoE was that it was more efficient (and hence faster) than a NAS? A NAS requires several layers including the IP stack, the OS filesystem at each end etc. A SAN does away with IP and, at the target end, the OS doesn't have to do much thinking at all.

growse said:
To me, it sounds like a SAN won't work, unless you go fibre-channel, and won't work if you want more than one person to access the data. I'd investigate the ethernet bonding route and see if you can get a 4GBit link using a pair of 4-port Gigabit cards.

Why would fibre-channel SAN work where as an Ethernet SAN won't? Is it because FC can do up to 4Gbps on a single link?

Tui said:
Bonding won't do what is wanted. Multiple channels are used to load share based on source, destination or both (MAC or IP address) and this determines which channel is used. Traffic between the two same points will always go down the same channel so the maximum throughput will be the speed of that channel.

Oh, bother. But now I really am confused... what you've just said sounds different to what the WikiPedia entry on Link Aggregation says:

"Link aggregation, or IEEE 802.3ad, is a computer networking term which describes using multiple Ethernet network cables/ports in parallel to increase the link speed beyond the limits of any one single cable or port, and to increase the redundancy for higher availability... Network interface cards (NICs) can also sometimes be trunked together to form network links beyond the speed of any one single NIC. For example, this allows a central file server to establish a 2-gigabit connection using two 1-gigabit NICs trunked together."

So, to take the WikiPedia example of a server with 2 x 1Gbps NICs... will the server only hit 2Gbps if there are at least 2 clients pulling data off the server?

Thanks loads for all your help,
Sorry for questioning the replies - I'm just trying to get a complete understanding of this technology.
Jack

growse · 7 Mar 2007 at 09:51

dan_aka_jack said:
*snipples*
Jack

I must correct, there is indeed ATA-over-ethernet, but I don't think I've ever seen it used seriously.

In the case of an iSCSI SAN, it doesn't really matter whether your drives are SATA, SCSI or floppy disk - the client will make/write a block-level request to the shared disk, thinking it's a locally attached disk. The iSCSI initiator then takes this request, turns it into a series of SCSI commands, wraps it up in IP (I believe, may even be TCP) and sends it to the host. The host then has to take the iSCSI packets, strip out the networking stuff and then apply the block-level commands to the host disk.

I said that there are overheads when it comes to iSCSI - I meant overheads in the sense that the CPU has to do a lot of work, if you're using normal network cards. Fibre-SANs are better for 2 reasons: 1) You're not wrapping it up in something like IP (or TCP), there's a much more lightweight wrapper as the whole thing is designed for moving disk commands / data down the wire as fast as possible. 2) Most Fibre-HBA's offload the processing from the CPU, so the packets that come down the fibre are deciphered by the HBA which can then talk to the disk with only minimal CPU involvement. You can get iSCSI HBA's that start to do some of this, but again they're $$$.

In my experience, putting a SAN on an ethernet-based network is always going to be a fudge, and never going to be as fast as a fibre-SAN. They're very useful things if you absolutely need a SAN and don't have much cash, but they don't perform that well. I've an iSCSI SAN at work which has it's own dedicated gigabit infrastructure, and my IO performance over it is nowhere near the local SCSI disk access (25MB/s vs 84MB/s).

The reason I mentioned the NAS is that it may be a better and more future-proof solution for what you're trying to do. Again, in my experience I can achieve reasonable transfer rates just using windows filesharing over a gigabit link, better than my iSCSI SAN. I'm not an expert, but I think that NAS -like transfer is more efficient, as it's simply wrapping the file data up in TCP/IP and moving it, rather than trying to wrap the additional disk accessing chatter as well. Also has the benefit of allowing multiple clients access to the same shared disk.

Personally, if I wanted performance, I wouldn't use iSCSI. If I had an application that absolutely needed a SAN (MSSQL database cluster) but couldn't afford fibre, then I'd use iSCSI. Everything else I find that fibre-channel or NAS fits the bill, depending on what you can afford.

dan_aka_jack · 7 Mar 2007 at 11:39

Hi!

Thanks loads for the quick reply - that's cleared up a lot of my questions, thank you.

Cool - I will look deeper into building a Linux NAS box rather than trying to do some sort of Ethernet SAN.

What sort of speeds do you get on your Gigabit NAS?

Thanks,
Jack

sniper007 · 7 Mar 2007 at 11:57

Interesting. Ive looked for a while at building a Linux box for Server tasks and storage using also Software RAID 5. However, the data I would be storing is critical (family photos I could NEVER lose) and I would have to be able to cater for good redundancy of all possible failures. Obviously the whole point in RAID 5 would be to allow for disk failure and be fault tollerant but there are other issues for me that need addressing. Heres some thoughts:

1: If you go the RAID Controller card route - say your Controller card failed 5 years down the line and you couldnt get the same type controoler card again? You technically could NEVER get your data back as you would need the same type of card to access the RAID 5 written data, as different cards have different writing algortihms to my knowledge. You could buy a backup controller card but that = ££££

So you go the software raid 5 route -

2: How easy is it + downtime for another disk to be chucked in when one has failed?

3: What other difficulties/knowledge would one have/need to go this route? I have no Linux knowledge. i recall reading a guy that did the same on google search, and he had a nightmare getting it all how he wanted.

4: Lets assume your Linux OS boot parition is on its own Hard disk outside of the Raid array (controlling the softwrae raid). What happens if this Hard disk drive fails? Can the data be recovered? What steps would one have to take to get data back up and running or recovered? Could one use Raid 1 for the Linux OS boot partition to avoid this?

Cheers

growse · 7 Mar 2007 at 12:12

dan_aka_jack said:
Hi!

Thanks loads for the quick reply - that's cleared up a lot of my questions, thank you.

Cool - I will look deeper into building a Linux NAS box rather than trying to do some sort of Ethernet SAN.

What sort of speeds do you get on your Gigabit NAS?

Thanks,
Jack

I've seen about 40-50MB/s doing file transfers at home. Not great, but better than iSCSI.

sniper007 said:
If you go the RAID Controller card route - say your Controller card failed 5 years down the line and you couldnt get the same type controoler card again? You technically could NEVER get your data back as you would need the same type of card to access the RAID 5 written data, as different cards have different writing algortihms to my knowledge. You could buy a backup controller card but that = ££££

You should never never never never rely only on RAID to keep your data. Always, always backup data to either a different medium (tape), or offsite (I keep mine on a server in a different continent), or both (if you can afford warehouse space

)

RAID is not really a data resiliency technology, it's a system resiliancy technology that allows the system to keep on operating even when a drive fails. It's really not about protecting storage, in my view.

daz · 7 Mar 2007 at 14:22

I think you will need to look at SCSI (well, SAS) to get anywhere near those speeds.

Tui · 7 Mar 2007 at 21:27

dan_aka_jack said:
So, to take the WikiPedia example of a server with 2 x 1Gbps NICs... will the server only hit 2Gbps if there are at least 2 clients pulling data off the server?

Yes. Cisco calls bonding Etherchannel:

Understanding Load Balancing

An EtherChannel balances the traffic load across the links in an EtherChannel by reducing part of the binary pattern formed from the addresses in the frame to a numerical value that selects one of the links in the channel.

EtherChannel load balancing can use MAC addresses or IP addresses. With a PFC2, EtherChannel load balancing can also use Layer 4 port numbers. EtherChannel load balancing can use either source or destination or both source and destination addresses or ports. The selected mode applies to all EtherChannels configured on the switch.

Use the option that provides the balance criteria with the greatest variety in your configuration. For example, if the traffic on an EtherChannel is going only to a single MAC address and you use the destination MAC address as the basis of EtherChannel load balancing, the EtherChannel always chooses the same link in the EtherChannel; using source addresses or IP addresses might result in better load balancing.