Hyper-V configs

Hi all,

I'd really appreciate expert advice on this, as my current support provider doesn't seem to be coming up with the answers.

Scenario:

2 x HP ProLiant ML370/G6 servers (S1 and S2) - each has 4 physical NICs
All hosts and guests run WS2008 R2.

S1 = 32GB RAM, 2 x E5520 CPUs, 1.8TB RAID 5 SAS 10k disks (BIOS and drivers up to date)

Hyper-V guests:
VDC - 4GB static, 2 vCPUs - domain controller
VPS - 4GB static, 2 vCPUs - print server

S2 = 68GB RAM, 2 x E5620 CPUs, 2.8TB RAID 5 SAS 10k disks (BIOS and drivers up to date)

Hyper-V guests:
VTS - 25GB static, 4 vCPUs - currently DYNAMIC disk - RDS (terminal server)
EXCHANGE - 20GB static, 4 vCPUs - currently DYNAMIC disk (Exchange 2007 SP3, RU6)
VIRTUE - 8GB static, 2 vCPUs - currently DYNAMIC disk (DMS)
VWSUS - 4GB static, 2 vCPUs - currently DYNAMIC disk (MS updates)

As each ML370 host has 4 physical NICs, I've created a virtual network on three of them (VL2, VL3, VL4).

On S2 only VIRTUE and VWSUS share one of these (as they are not as important as VTS and EXCHANGE).

The problem is that VTS (RDS) users experience screen freezes and sluggishness while working; the freezes often happen when accessing Outlook 2010. When I check EXCHANGE, its disk I/O is very high.
No one seems to know why.

Things I'm planning on doing:
Convert VTS and EXCHANGE from dynamic to fixed disks. The other two can stay as dynamic, unless suggested otherwise.
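Before converting anything, it's easy enough to double-check which images really are dynamically expanding - the VHD format stores the disk type in a 512-byte footer at the end of the file. A rough Python sketch (the path at the end is just a placeholder for wherever the guest VHDs actually live):

import struct

# Disk type values from the published VHD format spec: the footer is the last
# 512 bytes of the file, cookie "conectix", type at offset 60 (big-endian UINT32).
VHD_TYPES = {2: "fixed", 3: "dynamically expanding", 4: "differencing"}

def vhd_type(path):
    """Return the disk type recorded in a .vhd file's footer."""
    with open(path, "rb") as f:
        f.seek(-512, 2)              # seek to the footer at the end of the file
        footer = f.read(512)
    if footer[0:8] != b"conectix":
        raise ValueError("no VHD footer found in %s" % path)
    (disk_type,) = struct.unpack(">I", footer[60:64])
    return VHD_TYPES.get(disk_type, "unknown (%d)" % disk_type)

# Placeholder path - point this at the actual VHD for the guest in question.
print(vhd_type(r"D:\Hyper-V\EXCHANGE\disk0.vhd"))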

Now, currently ALL the VL NICs have "Allow management operating system to share this network adapter" ticked. Not sure whether it should be this way or not.

The host OS uses its own physical NIC.

So, does this sound like a reasonable setup, or are there obvious flaws in the config of these two host machines and the guests on them?

We have 2 physical and 1 virtual DC/GC, and replication across all 3 often fails - not sure if this makes a difference to the rest of the above.
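As a starting point on the replication side, the built-in tools should at least show where it's failing - a minimal sketch that just wraps the standard commands, run from one of the DCs:

import subprocess

# Quick AD replication health check using the standard built-in tools:
# "repadmin /replsummary" summarises failures per replication partner,
# "dcdiag /test:replications" runs the replication-specific diagnostics.
for cmd in (["repadmin", "/replsummary"],
            ["dcdiag", "/test:replications"]):
    print(">>> " + " ".join(cmd))
    subprocess.call(cmd)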

I know this is a long read, but I could really do with the help. It's also a decent learning curve for me if I can integrate any advice you guys have to offer.

Many thanks in advance.
 
VIRTUE and VWSUS sound like a bad disk config, and a possible bottleneck on the single NIC carrying traffic for two VMs.
 
How many spindles do you have in that RAID 5 array on S2? Having the Exchange DB and logs along with 5 OSes on the same array is most likely what's causing the sluggishness.
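To put rough numbers on that (back-of-the-envelope only - the per-spindle IOPS figure and the read/write mix below are assumptions, not measurements from your array):

# Rough IOPS budget for S2's array: 6 x 300GB 10k SAS spindles in RAID 5.
spindles = 6
iops_per_disk = 140        # assumed figure for a 10k SAS drive
raid5_write_penalty = 4    # each logical write costs ~4 back-end I/Os in RAID 5
read_ratio, write_ratio = 0.6, 0.4   # guessed mix for Exchange plus busy guests

raw_iops = spindles * iops_per_disk
effective_iops = raw_iops / (read_ratio + write_ratio * raid5_write_penalty)

print("raw back-end IOPS:     %d" % raw_iops)        # ~840
print("usable front-end IOPS: %d" % effective_iops)  # roughly 380 at a 60/40 mix

Under those assumptions the whole array offers only a few hundred front-end IOPS, shared between the host OS, four guests, the Exchange DB and logs, and WSUS content - which would line up with the sluggishness being described.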
 

The shared NIC on VIRTUE and VWSUS is the least problematic part, mate - VWSUS only really hits the network in the wee hours when it pushes out updates to clients, and obviously at that time no one is on the network, so VIRTUE isn't being used.

I'm happy with that setup.

oddjob, the whole array is made up of 6 x 300GB SAS drives.
On the Exchange VM the transaction logs are on a separate VHD, but our IT support reckon that having them on the same physical disk should be fine - although someone else made a similar comment to yours on another forum.

Thanks for your input so far, guys. Our network almost died today, so right now I'm converting the Exchange disk from dynamic to fixed and will see what happens. I'll do the same with our RDS server once I've finished virus-scanning it.
I'll probably team up the NICs as well.
 

Ah, fair enough. I'm doing a test setup of our VMs as we're centralising WSUS, AV and the print server, and the initial plan is to use one NIC per VM. I just used external mode for the test setup and it worked well, although the disk setup and array will be changing.
 
oddjob is bang on, this will 99% be disk config.

What are your current read/write/average queue length times?
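If it helps, those can be sampled from the host with the built-in typeperf tool - a minimal sketch using the stock PhysicalDisk counters, with an arbitrary 5-second interval and one minute of samples:

import subprocess

# Sample the standard PhysicalDisk latency and queue-length counters with
# typeperf (ships with Windows): 12 samples at a 5-second interval.
counters = [
    r"\PhysicalDisk(_Total)\Avg. Disk sec/Read",
    r"\PhysicalDisk(_Total)\Avg. Disk sec/Write",
    r"\PhysicalDisk(_Total)\Avg. Disk Queue Length",
    r"\PhysicalDisk(_Total)\Current Disk Queue Length",
]
subprocess.call(["typeperf"] + counters + ["-si", "5", "-sc", "12"])

Commonly quoted rules of thumb: sec/Read or sec/Write consistently above roughly 20-25ms, or a queue length well above the number of spindles, would point at the array rather than the network.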

I doubt changing from dynamic to fixed will have any performance impact - but certainly check against TechNet to see whether using them is a supported configuration. I know for some MS products dynamic disks aren't recommended / considered a supported configuration.
 
You could migrate WSUS to server 1 to lighten the load a bit?

I can't, as S1 is pretty much full storage-wise.
WSUS only really goes into action at 3am, so nothing else is running anyway.

oddjob, would you mind giving an example of how you would set up or reconfigure things with the hardware I have, please?
 
Another thing: if I want to team the 4 physical NICs on S1 as a test, can I team 2 of them so I'd essentially have a 2Gbit trunk for accessing data on the main host, and then team the other 2 to give 2Gbit split between the Hyper-V guests?
 
Hi, I would suggest you have a cursory look at your switch ports' error and collision counters, and also check that speed and duplex have negotiated on the switch as you would expect.

Also, just a pointer: be wary of active-active teaming of NICs on Hyper-V hosts - you need to make sure your switch is configured to LAG the right ports together, otherwise it will cripple network throughput (been there).
 

Interesting point, masterk - could you expand on this please?
I ask because at Christmas I installed 3 new Cisco SG-300 managed switches and created LAG groups in them with 4 x 1Gb trunks, to essentially give us a 4Gb backbone. Since then all the above problems have occurred, yet our IT support really don't think the switches are part of the problem.

Due to the issues I reduced the LAG groups on the switches to 1 port each (so, a 1Gb backbone). It almost seemed like having all 4 ports active in the LAGs created a loop, even though spanning tree is supposed to prevent loops (?) - at least, that's what I've been told. I have checked with Cisco and I have created the LAG groups correctly, but that's really the limit of my knowledge with managed switches.

Gonna read up about teaming the NICs and see what I can come up with.
 
Hello, sorry for the late reply. Are the port channels you're talking about between the three switches? I was just trying to say that you'll need to create port channels for the NICs you team on the Hyper-V hosts, or you'll experience poor performance.

Have you run some iperf tests?
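In case it's useful, a basic iperf 2 run between two machines looks like the sketch below - the hostname is only a placeholder, and you'd start "iperf -s" on the receiving end first:

import subprocess

# Basic iperf 2 client run against a listener started with "iperf -s".
subprocess.call([
    "iperf",
    "-c", "VTS",   # placeholder target - whichever guest/host runs the server side
    "-t", "10",    # 10-second test
    "-P", "4",     # 4 parallel streams to try to fill the link / LAG
    "-i", "1",     # report every second
])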
 