AD Slow Performance with VMware (and some Procurve questions thrown in)

Hi all,

I work for a company that deploys networks to small companies. Because of the cost involved, we're trying to avoid the expense of an ESX license, and so we've been using VMware Server 1 and 2 to run AD servers on top of PowerEdge 2900s.

However, at the last couple of sites we've installed, the general performance of the workstations throughout the organisations has been dire. I know we should have been using ESXi since its general release, but there's a lot of work involved in planning and changing scripts etc., so we're still in testing.

In the meantime we're stuck with Server 1 & 2, and the dire performance as a consequence. It seems to be a random problem - at different times during the day the whole network (i.e. the workstations connected to the domain) will grind to a halt, the computers will lock up when trying to save files etc.

Has anybody any experience with using VMware Server in a production environment? Or indeed has anybody experience with running AD controllers in a virtualised environment?

Also - I don't know if this is related, but at most of the sites we deploy two switches, one in the server rack and one in a comms cabinet. They're ProCurve 1800/2510/2610/2810s - so not cheap kit (in context of course) - and we tend to trunk two (gigabit) ports on each to act as the link between them. However, am I right in thinking that a trunk on a ProCurve is not the same as a trunk on a Cisco? And rather that LACP on a ProCurve is equivalent to a Cisco trunk? So are we setting the switches up wrong?

Any help or ideas (or sympathy :D) is greatly appreciated.
 
Sorry yea it is free - that's what I meant by general release. We've been testing it since it was released but are still a couple of weeks off being able to roll it out.
 
what else are you running on the 2900s natively? and what else are you running on the 2900s in vmware? what cpu, memory, and disk are you running on the 2900s? and what network mode are you running in vmware?

in terms of your switch questions... fairly sure cisco switches support both lacp and etherchannel, the latter of which i believe to be proprietary, for when you want to bond links together. as far as i am aware the procurves support lacp. assuming you're doing procurve <-> procurve then lacp should be fine.

i think you can set them up in two ways though? so you either have an active/standby channel with failover for redundancy, or an active/active channel for performance and redundancy... so, you might want to look at how they are set up?

Thanks for the response :)

The 2900s are Quad Core Xeons, one of the 2900s in question has 16GB of RAM, 2 x 400GB SAS and 2 x 750GB SATA. And at the same site we have two Procurve switches, with two trunked (not LACP'd) links. One switch is all Gigabit for the servers, the other has 2x Gbit and 10/100 for the rest.

Staying with the same server, there are 3 VMs - 2 Windows 2003 member servers (one running a pretty dormant SQL server, the other nothing as of yet), and then the last VM is the AD DC in question.

We made some changes to VMware's config yesterday that seemed to make a great improvement to the performance on the server itself. We configured it so that VMware no longer used the .vmem file and instead just used the allocated memory - with a shm partition available to it if it needed. CPU use went from 80%-100% avg down to 10%-50% - but unfortunately the change hasn't been visible to the end-users.

I have been thinking about the fact that all three VMs are on the same SAS RAID 1 array. One of my colleagues has suggested we could try adding a couple more disks and splitting the VMs across the arrays. But surely SAS is good enough to handle one server that's reasonably used and two others that are pretty much lying dormant?

The googling continues :D

EDIT: Sorry forgot to address your points about the Procurves - we're not using LACP for the Procurve <-> Procurve connection, just a straight trunk. Does LACP offer anything else other than redundancy?
 
sounds like your vm boxes are reasonably pokey, wouldn't do any harm to split the vms out onto their own disks though. can you clarify what you mean by 'just a straight trunk'? have you literally just cabled the switches together twice? if so, are you running spanning tree on the switches? otherwise you will have a loop?!

Sorry - yea it's definitely not creating a loop - the switches haven't died a death :D

With the ProCurves - you can either make Trunks or LACP Trunks. Afaik the LACP trunks are as you say: they provide redundancy and also allow you to bond the links like a normal trunk, giving you 2Gbps and redundancy. I think the 'Trunk' option just turns the two ports into an Uplink/Downlink that dies if one of them is removed.
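
On the 2Gbps point: as far as I understand it, link aggregation generally spreads traffic per conversation rather than per packet, so a two-port trunk can carry up to 2Gbps in aggregate while any single source/destination pair stays on one 1Gbps link. A rough Python sketch of the idea is below. The XOR-of-last-byte hash, the MAC addresses and the `pick_link` name are all made up for illustration and are not the ProCurve's actual algorithm.

```python
def pick_link(src_mac: str, dst_mac: str, n_links: int = 2) -> int:
    """Hypothetical hash: XOR the last byte of each MAC, modulo the link count."""
    src_last = int(src_mac.split(":")[-1], 16)
    dst_last = int(dst_mac.split(":")[-1], 16)
    return (src_last ^ dst_last) % n_links


# Made-up addresses: one server and eight workstations.
server = "00:1a:4b:00:00:10"
clients = [f"00:1a:4b:00:01:{i:02x}" for i in range(1, 9)]

# Count how many client/server conversations land on each physical link.
usage = [0, 0]
for client in clients:
    usage[pick_link(client, server)] += 1

print("conversations per link:", usage)
```

The same pair always maps to the same link, so one workstation copying a big file to the server would not go faster than a single gigabit port; the gain only shows up when many pairs are talking at once.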

/helplessly scouring the VMware forums
 
need to know how much ram is in the host, and how much is given to each guest. is this just one customer or a general problem with a few?

Is it the same time each day? What is the OS on the host? Are you running VMware Server here?

This seems to be a general problem with three or four sites/customers, all with a similar setup. And the slowdowns occur during the day, so under relatively light load with all the users logged on to the domain and carrying out day-to-day operations. There are only 20 users at most at each site so I would have thought the setup we've given them would be more than capable. Of course, I could be wrong :)

I'll give two examples of the sites in question:

Site A:
Host: Quad core Xeon, 16GB RAM, 2x SAS 400GB, 2 x SATA 750GB, VMware Server 1
VMs: 1 CPU, 3.6GB RAM
VM1: Windows 2003 SP2, AD DC, Exchange 2007 (15-20 mailboxes)
VM2: Windows 2003 SP2, SQL Server
VM3: Windows 2003 SP2

Site B:
Host: Quad core Xeon, 8GB RAM, 4 x SAS 400GB, VMware Server 2
VMs: 2 CPUs, 3.6GB RAM
VM1: Windows 2003 SP2, AD DC, Exchange 2007 (10-15 mailboxes)
VM2: Windows 2003 Terminal Server - not yet powered on

With Site A, I'm soon to be upgrading to VMware Server 2 and giving VM1 8GB (maybe more?) of RAM.

I know some might say it's a lot to be running an AD DC and Exchange on a VM with 3.6GB of RAM, but we've modded the VMware Server configuration as per this article and so there should be no disk swapping going on - and within the VMs they're only using half of their allocated RAM. I'm confused as to what's causing the problem :(
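
For reference, here's a quick Python sketch of the memory sums for the two sites above, using only the figures already listed. It assumes every listed VM is powered on at once (Site B's VM2 isn't yet) and it ignores whatever the host OS and VMware Server themselves need, so treat the headroom as an upper bound. The `Site` class and its method names are just my own scaffolding.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Site:
    name: str
    host_ram_gb: float
    vm_ram_gb: List[float]  # RAM allocated to each VM

    def committed_gb(self) -> float:
        """Total RAM handed out to guests if all of them are running."""
        return sum(self.vm_ram_gb)

    def headroom_gb(self) -> float:
        """RAM left for the host OS and VMware Server (upper bound)."""
        return self.host_ram_gb - self.committed_gb()


sites = [
    Site("Site A", 16, [3.6, 3.6, 3.6]),
    Site("Site B", 8, [3.6, 3.6]),  # VM2 is not powered on yet
]

for site in sites:
    print(f"{site.name}: {site.committed_gb():.1f}GB committed, "
          f"{site.headroom_gb():.1f}GB left for the host")
```

By that sum Site A has about 5.2GB spare, but Site B would be down to roughly 0.8GB for the host once the Terminal Server VM is switched on, and with the memory locked in RAM as per the config change there would be very little slack.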

Thanks everyone for all your ideas and feedback
 