NetApp Redundancy

My colleague and I have been setting up a dual-controller 2240 this past week with three ESXi hosts on HP DL385s. We also have two Cisco 2960s for use as storage-only switches.

We have set up iSCSI and tested redundancy by turning off a switch and by doing a controller failover, and that works OK. We are having an issue with NFS though.
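
For reference, the controller failover test is just the standard takeover/giveback from the 7-mode CLI, something like this (run on the node that should take over its partner):

# takeover/giveback test from the filer CLI
cf takeover
# ...check the datastores stay up, then hand back...
cf giveback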

The two switches are not connected to each other, and we were not sure whether we could use a multi-mode or single-mode link configuration.

We have created a vif with two interfaces per controller, with each interface going into one switch. It works OK, but we are getting "Link Monitoring Logic Failed" on the NFS interfaces on the CLI. When we do a controller failover, NFS does not fail over as expected. Someone we know suggested that the switches would have to be connected to each other. Currently the vif is in single-mode configuration, as opposed to multi-mode or LACP. We have not tried multi-mode yet and don't think LACP will work because two separate switches are in use.
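
Roughly, the single-mode vif on each controller looks like this on the 7-mode CLI (the interface and vif names are placeholders for illustration, the address is from our NFS range):

# single-mode vif with one interface cabled to each 2960 (names are examples)
vif create single nfs_vif e0a e0b
ifconfig nfs_vif 10.10.3.152 netmask 255.255.255.0 partner nfs_vif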

Are we doing it all wrong? It might be difficult to help without seeing the NetApp configuration, but if you have any advice on using this sort of configuration I'll appreciate it.


edit: OK, the NFS issue is resolved. The two switches needed connecting together, because the NetApp interfaces need to be able to communicate with each other; when they can't, it causes the issues we were having.
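
For anyone hitting the same thing, the inter-switch link we ended up with is just a trunk port on each 2960, along these lines (the port number is only an example):

! on each 2960, the port used as the inter-switch link (example port)
interface GigabitEthernet0/24
 description ISL to other storage switch
 switchport mode trunk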


Another question I have: is it recommended to put NFS and iSCSI onto their own VLANs, even if there are isolated storage switches and each protocol has its own interfaces?
 
Officially they should be on their own separate VLANs, or as a minimum their own subnets, purely to contain any broadcast traffic, allegedly. I can't see any, mind you.

Our NetApps are on the same subnet for iSCSI as well as data, given the small number of clients connecting. Add to that the fact that they have LACP connections across a stacked Cisco switch, plus I couldn't justify the extra purchase of additional dedicated storage switches (this year), and I wanted 10G switches for our vMotion traffic anyway.
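
If it helps, the LACP bit is just a standard port-channel on the stack with an LACP vif on the filer, roughly like this (port, channel and interface names are examples):

! Cisco side - example ports on two different stack members
interface GigabitEthernet1/0/1
 channel-group 10 mode active
interface GigabitEthernet2/0/1
 channel-group 10 mode active
interface Port-channel10
 switchport mode trunk

# filer side - LACP vif with IP-based load balancing (example names)
vif create lacp data_vif -b ip e0a e0b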
 
They are on their own subnets:

iSCSI on 10.10.2.152-4
NFS on 10.10.3.152-4

The reason we really had to get storage switches is that they currently have Cisco stacks with no free ports and only about 10% gigabit ports. We couldn't justify to the client replacing all the switches at this point. Plus I think it is better to have storage isolated.

We are going to put them on their own VLANs as that is what everyone seems to recommend, but we couldn't logically justify it as a requirement, as it worked without it.
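
On the filer, tagging a storage VLAN onto the vif is something like this in 7-mode (the VLAN ID is just an example, the address is from the NFS range above):

# tagged VLAN interface on top of the NFS vif (VLAN ID 30 is an example)
vlan create nfs_vif 30
ifconfig nfs_vif-30 10.10.3.152 netmask 255.255.255.0 partner nfs_vif-30
# same idea for the iSCSI interfaces with their own VLAN ID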

It's quite annoying how the OnCommand GUI does not allow you to delete vif or VLAN configurations. Also, if you use the CLI, that doesn't persist the configuration and it conflicts with OnCommand. The only way is to write out the full /etc/rc config file in one go. It seems far more hassle than it should be, especially for 30k. It comes with an A3-size manual with massive instructions on it, almost a bit of a joke.
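
For anyone else fighting it, the /etc/rc approach boils down to writing the complete network config into the file so it survives a reboot; ours ended up looking roughly like this (hostname, interface names and the second iSCSI address are placeholders):

# /etc/rc - example layout only, read at boot so it has to be complete
hostname FILER1
vif create single nfs_vif e0c e0d
ifconfig nfs_vif 10.10.3.152 netmask 255.255.255.0 partner nfs_vif
ifconfig e0a 10.10.2.152 netmask 255.255.255.0 partner e0a
ifconfig e0b 10.10.2.153 netmask 255.255.255.0 partner e0b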

The takeover is pretty quick though. Hopefully the client sees some improvement in performance, as currently they are on a NetApp 2040 with NFS and iSCSI going over only one gigabit link (with one gigabit link in standby) to three ESXi hosts, compared to 2x gigabit for NFS and 2x for iSCSI plus new ESXi hardware.

Sucks when the client spends £45k on hardware and claims they don't notice any difference...

Thanks for the response.
 
You can use link aggregation from the ESXi hosts for iSCSI traffic: have each NIC go to a different physical switch, put them into the same vSwitch, and use IP hash routing (off the top of my head). That will work and give you your resilience.

However, you can't do the same for NFS or data, as there needs to be a single destination for the traffic to aim at going into your hosts, and that is not possible with two disparate switches; even if they are joined at the copper/fibre level, they are still two totally separate entities.

Also, what is the network length (subnet mask) of the iSCSI and NFS networks?

We have spent £100K on our new NetApps, and the users don't notice any difference, but I do. The new things fly, especially with the SSDs in for the ESXi hosts - they love SSDs.
 
Do you use the SSDs for cache only? We run our ESXi hosts off SD cards and only use shared storage. We will only be using NFS for the VM datastore and will have iSCSI going to the old Exchange box through Windows, because that is the current Exchange configuration; when we go to 2010 we will move it to an NFS datastore. We just added iSCSI to ESXi to see if we could get it working.
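
The filer side of the Exchange iSCSI bit is just the usual igroup and LUN mapping, something like this (the volume, size and initiator IQN are made-up examples):

# example only - volume path, size and IQN are placeholders
igroup create -i -t windows exch_igroup iqn.1991-05.com.microsoft:exchange01
lun create -s 500g -t windows_2008 /vol/exch_vol/exch_lun
lun map /vol/exch_vol/exch_lun exch_igroup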

We have managed to get NFS working with two NICs and two switches: one vSwitch and two physical NICs, both active/active, with each NIC going into a different switch. We have tested by turning off a switch and by doing a NetApp failover, and the datastore stays up. It was not working at first; what we had to do was connect the storage switches together and then trunk those ports so that they accepted all VLAN traffic between the switches.
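
For completeness, the ESXi side of that can also be set from the shell on 5.x, roughly like this (vSwitch and vmnic names are whatever yours are):

# example - vSwitch/vmnic names are placeholders, both uplinks active
esxcli network vswitch standard uplink add --uplink-name=vmnic2 --vswitch-name=vSwitch1
esxcli network vswitch standard uplink add --uplink-name=vmnic3 --vswitch-name=vSwitch1
esxcli network vswitch standard policy failover set --vswitch-name=vSwitch1 --active-uplinks=vmnic2,vmnic3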
 
I'm once again dazzled by you, groen.

You've arrived at "a solution", which is great - but do you know why you had to do what you did? Do you know what you would do differently when ordering kit for a deployment like this again?

There are too many mistakes in your posts in here to correct but I really hope you have a think about it.
 
The SSDs are in a Flash Pool, so the hot datasets from the array are stored in flash as well as on the array. We only have about 300GB of usable flash storage, but for the ESXi hosts it's lightning quick, as so far they are all Server 2008 R2. Plus the LUN is de-duped.
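
For reference, setting that up on 7-mode boils down to something like this (aggregate/volume names are examples, and the syntax is from memory, so check the docs for your ONTAP version):

# example only - names are placeholders, syntax from memory
aggr options aggr1 hybrid_enabled true
sis on /vol/esx_vol
sis start -s /vol/esx_vol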
 

We can't seem to get iSCSI redundancy to work.

I didn't spec out this kit; I suggested the switches, and initially I suggested going FC, but there was no budget for it. I am not sure what you are referring to that we did wrong; if you could please let me know, that would be great.

iSCSI redundancy just does not want to work. We need to set up iSCSI from a Windows VM to the NetApp. We can get one interface on the NetApp, going through one switch, to ping from the Windows VM, but no matter what we do we can't get the other NetApp interface to work. :(

We are both kind of fed up with it, as we have tried everything we can and don't know why it's not working. I am going to ask some VMware guys I know if they can help.

edit: got iSCSI multipath working in Windows in the end. It just needed a second NIC, a second port group, and the multipath config. But now we have to work out why we have high latency on the iSCSI paths. :(
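
For anyone following along, the Windows side of the multipath config was roughly the standard MPIO claim for iSCSI devices (from memory, so double-check against the MPIO docs):

:: claim iSCSI-attached devices for MPIO, then reboot (Server 2008 R2, example only)
mpclaim -r -i -d "MSFT2005iSCSIBusType_0x9"
:: after the reboot, list the MPIO disks and their paths
mpclaim -s -d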
 