I moved some of my Linux VMs to SSD and they have been fine for a while. They are set up like this:
Minecraft Server (CentOS) - Vertex II VM Datastore (60GB)
SABnzb OS (CentOS) - Vertex II VM Datastore (60GB)
SABnzb data area - WD Scorpio Black (500GB)
vSphere Swap areas - Vertex II Swap (60GB)
For ease of setup, the SABnzb data area drive is mounted on /home.
Yesterday I noticed a number of errors from SABnzb, which is quite unusual. It had been a while since the last reboot, so I rebooted. After quite a while it came back up, but it was very slow, with pauses of a couple of minutes every minute or so. On checking the log (/var/log/messages) I noticed mention of a filesystem check failing on the SABnzb data area drive. I unmounted the drive and ran fsck.ext4 on it, which took a very long time (especially with the pauses) and reported a number of file system errors. The pauses still persisted. I then removed the drive from the VM (but not from the server) and added a new virtual drive on a different datastore (SAN), but this made no difference.
My next thought was that the Vertex II they are installed on might be having issues, so I tried to start the Minecraft Server, which had been down. It took a long time to start (VMs were getting stuck at 95% for quite a while during startup). Once it was up I had problems accessing it via PuTTY and the console.
I then created a new VM on a spare SSD I had around but had not used yet (Agility 3 120GB). As the install ISO was on the Vertex II Swap drive, the install was lovely and fast, but I was still seeing pauses even on this VM.
I moved the swap off the Vertex II Swap onto a WD Scorpio 320GB I had spare and had a look at the vSphere server logs from the console. The first thing I noticed was that the last entries in all the logs were dated 1st Nov. I do not know whether this is a time config issue on the vSphere server or a logging issue. I will check tonight. I could see a number of heartbeat timeouts for the SSDs (all three) repeating over and over again. I rebooted the vSphere server but still had the pausing on the VMs for the most part. PuTTY to the new SABnzb appeared fine though. The new SABnzb data area was put on my SAN so that I can remove the Scorpio Black 500GB for a full health check in my desktop machine. The Minecraft and old SABnzb VMs have been left down.
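To see whether the timeouts cluster on one drive, something along these lines could tally them from a copy of the log. This is only a rough sketch: the keywords and the datastore labels passed in are placeholders of my own, not the actual wording used in the vSphere logs, so they would need adjusting to whatever the log lines really say.

```python
from collections import Counter


def count_matches(lines, keywords, labels):
    """Count lines that contain every keyword, grouped by the label they mention.

    keywords and labels are placeholders chosen by the caller; matching is a
    plain case-insensitive substring test, not a parser for any real log format.
    """
    counts = Counter()
    for line in lines:
        lower = line.lower()
        # Only consider lines that mention all of the keywords.
        if all(k.lower() in lower for k in keywords):
            hits = [lab for lab in labels if lab.lower() in lower]
            # Lines that name none of the labels are still counted, separately.
            for lab in hits or ["(unlabelled)"]:
                counts[lab] += 1
    return counts
```

Passing an open file as `lines`, e.g. `count_matches(open(path, errors="replace"), ["heartbeat", "timed out"], my_labels)`, then printing `most_common()` would show which drive the messages pile up on, if any.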
I believe the pauses are related to the SSD timeouts. All my drives are connected to an IBM M1015 (flashed to LSI 9211-8i IT firmware), as it is a SATA III controller and the motherboard only has 2x SATA III connections. The SSDs are split over two different SAS cables to the controller. I also have another Agility 3 120GB which is not showing up on the vSphere server at all. Possibly a dead drive, but I have not checked yet. It has been there for some time, and last night was the first time I had seen an issue.
The vSphere server is only used for my own home mini servers and testing since I have moved my business stuff to a dedicated HP ML110 G7. I can, therefore, pull drives etc without too many issues.
I do have some Intel 520 120GB SSDs around, but they are new stock, so I would rather not use them: I would then have to pay for them, and I would rather not have the business buy them for "own use" right now.

Any suggestions for narrowing down the problem?
RB