ESXi 6.5 issue

I have used ESXi 5 before, so I'm not new to it. I've got a new Dell T20 Xeon with 4GB RAM, and I have two VMs on there using ESXi 6.5, both Ubuntu 16.10 Server.

One VM runs Nextcloud, the other runs iRedMail.

However, rather annoyingly, whichever VM was the last one to boot will hang after half a day or so; the other one is unaffected. I have tried juggling the 4GB between them, but it keeps happening.
 
Have you checked the VM log file located in its directory on the datastore? It might give you some idea why it's hanging.
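
If you can SSH to the host, it's normally at /vmfs/volumes/<datastore>/<VM name>/vmware.log. If the file is huge, a quick Python sketch like this will pull out the interesting lines (the path is a placeholder, and the keyword list is just my guess at what's worth grepping for):

import sys

# Scan a vmware.log for warning/error tags and hang-related hints.
# Placeholder path - substitute your own datastore and VM name.
LOG_PATH = "/vmfs/volumes/datastore1/nextcloud/vmware.log"
KEYWORDS = ("W115", "E105", "MsgHint", "halt", "panic")

with open(LOG_PATH, errors="replace") as log:
    for line in log:
        if any(keyword in line for keyword in KEYWORDS):
            sys.stdout.write(line)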

Are VMware Tools installed?

How many vCPU and how much memory do you have assigned to each VM?
 
I have Tools installed, yes. I will see if I can dig out the log when I'm home; it's not accessible from work.

Each VM has access to all the cores. Nextcloud has 2GB and the mail VM 1GB of RAM, leaving the rest to ESXi. I've tried 2GB to each and other combinations, but it makes no difference.
 
When the VM hangs, what's the total memory usage on the host? 1GB was enough for 5.5 with no passthrough, but is that still applicable for 6.5? How much memory does the host use with no VMs running?
 
Well, it lasted a few hours, then this appeared in the log:

2016-12-22T12:56:29.037Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2016-12-22T12:57:30.024Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2016-12-22T12:58:31.033Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2016-12-22T12:59:32.032Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2016-12-22T13:00:33.031Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2016-12-22T13:39:53.641Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2016-12-22T13:39:54.638Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2016-12-22T13:39:55.890Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2016-12-22T13:39:56.035Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2016-12-22T13:39:56.126Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2016-12-22T16:51:52.011Z| vcpu-1| I125: APIC THERMLVT write: 0x10000
2016-12-22T16:51:52.011Z| vcpu-3| I125: APIC THERMLVT write: 0x10000
2016-12-22T16:51:52.155Z| vcpu-2| I125: APIC THERMLVT write: 0x10000
2016-12-22T16:51:52.155Z| vcpu-0| I125: APIC THERMLVT write: 0x10000
2016-12-22T16:51:52.155Z| vcpu-0| I125: Vix: [162394 vmxCommands.c:7212]: VMAutomation_HandleCLIHLTEvent. Do nothing.
2016-12-22T16:51:52.155Z| vcpu-0| I125: MsgHint: msg.monitorevent.halt
2016-12-22T16:51:52.155Z| vcpu-0| I125+ The CPU has been disabled by the guest operating system. Power off or reset the virtual machine.

Not sure what all the screenshot stuff is; that goes on for ages.
 
Each VM has access to all the cores

This is a common mistake when building virtual environments, and an easy one to make when moving from a physical environment to a virtual one. The way ESXi slices up CPU time means this is not optimal: if you only have two VMs, you would be far better off assigning half the cores to each VM, and even that would be overkill.

As an example, if you have four physical cores and create four VMs, each with one vCPU, they can all run together unhindered. If you then change them to two vCPUs each, only two of the VMs can be scheduled at any one time, meaning the other two have to wait for CPU time to become available. Increase all four to four vCPUs and only one VM can run at any one time. This is a simplified example, but it gives you a rough idea.
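
Back-of-envelope sketch of the same arithmetic, if it helps picture it (simplified to strict co-scheduling; real ESXi relaxes this, so treat it as a worst case):

# Worst-case concurrency under strict co-scheduling: every vCPU of a
# VM must get a physical core at the same instant before the VM runs.
physical_cores = 4

for vcpus_per_vm in (1, 2, 4):
    concurrent_vms = physical_cores // vcpus_per_vm
    print(f"{vcpus_per_vm} vCPU(s) per VM -> {concurrent_vms} VM(s) runnable at once")

# 1 vCPU(s) per VM -> 4 VM(s) runnable at once
# 2 vCPU(s) per VM -> 2 VM(s) runnable at once
# 4 vCPU(s) per VM -> 1 VM(s) runnable at once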
 
SupraWez is spot on.

If you look at the performance metrics for the VMs from the client and select CPU Ready as the only metric, you will probably see it is sky high. CPU Ready tells you how long the VM has been waiting to access the physical CPUs; as a general rule of thumb, anything above 100ms in the real-time view would lead me to look at the VM's CPU allocation.
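
For reference, the real-time charts sample every 20 seconds and report CPU Ready as milliseconds per sample, so converting it to a percentage looks like this (quick sketch; the 100ms figure above works out at 0.5%):

# Convert a real-time CPU Ready summation (ms per sample) to a percentage.
# Real-time performance charts use a 20-second sample interval.
def cpu_ready_percent(ready_ms, interval_s=20, num_vcpus=1):
    return ready_ms / (interval_s * 1000 * num_vcpus) * 100

print(cpu_ready_percent(100))                 # 0.5  -> the 100ms rule of thumb above
print(cpu_ready_percent(4000, num_vcpus=4))   # 5.0  -> commonly cited worry threshold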
 