Home Lab Threadripper Build Thread.

Great case choice.

Thanks dude... I should really whack some updates out on this. I am sure the world wants to know if ASRock ever sorted the Taichi for ESXi; even I want to know! I put the board back in and everything just to find out!

I still haven't tried, mind. I ended up building a second TR rig for ESXi! Over this weekend I am making it my mission to answer the question of whether it works. FWIW, I did end up adding a shedload of GPUs into the ESXi lab and testing GPU passthrough on the X399-A. I had passthrough running on Vega, RX570, RX560, RX550 & RX480, so I can confirm that this does work, but there are some ESXi config files that need tweaking to make it seamless.
 
An update if anyone is interested:
- The new build at the office is up and running like a charm with 64GB of RAM for now. RAM prices are pretty high, so we might upgrade later (hoping to get the same model). The performance improvement is significant: we are talking about a 300% speed increase in some tasks! My boss is happy.
- I decided to get an Optane myself and test it. I installed it last Friday and moved my Windows install onto it. I'm waiting for an opportunity to boot ESXi from my USB key to see if it's at least detected. Windows performance is better, but not significantly (upgrade from a Plextor M5P).
 
I can help you get the GPU passthrough working, as there are some modifications you will need to make to some ESXi files to achieve this without faffing around.

Sounds like the ASRock might still have some AHCI issues, but I can't confirm as I've yet to put my board back in. Probably a job for the weekend now.

Performance-wise I am surprised. Out of interest, what RAM are you using and how fast are you running it on the 1950? Also, if you could throw me the files, I could try to help confirm your results and compare against the Asus board before I rebuild.
Hey Vince, for getting passthrough for those cards to work, do you mind pointing out which ESXi config files require tweaks and how to tweak them? Thank you!
 
Hi guys! How is it going? Are you going to try Threadripper 2 at 32 cores? Hehe!

Our machine has been really good at work and people are really appreciative because of the performance upgrade. It's been running steady 24/7 since April.

Last Friday 2 VMs crashed: one crashed only once and the other crashed multiple times. The day before, I had tested an imported VM that caused memory overcommitment, so I stopped it. But the user who had the multiple crashes then reported another crash. So I did what I could within that VM: I checked the Windows power settings and tested the RAM and hard drives, but found no problem. I also checked with BlueScreenView and there was nothing.

I then moved that VM to another SSD. I'm currently waiting to see if any more crashes occur.

Do you think the overcommitment caused those crashes? The user reported lag in Windows before the crash.

If the crash occurs again, our next step will be testing the host's memory with Memtest86 and checking S.M.A.R.T. data on the SSD and hard drives. ESXi reports a SMART temperature of around 70 degrees Celsius for the SSD; I don't think that's correct.

2018-09-04T14:56:27Z smartd: [warn] t10.ATA_____KINGSTON_SKC400S37512G__________________50026B768200FD6C____: above TEMPERATURE threshold (72 > 30)
2018-09-04T14:56:27Z smartd: smartmgt: plugin /usr/lib/vmware/smart_plugins/libsmartnvme.so is already loaded
2018-09-04T14:56:27Z smartd: smartmgt: plugin /usr/lib/vmware/smart_plugins/libsmartmicron.so is already loaded
2018-09-04T14:56:27Z smartd: libsmartsata: is_ata_smart_device:5 buf[82]:1 rc:0
2018-09-04T14:56:27Z smartd: libsmartsata: is_ata_smart_enabled ata fd:5 val:1
2018-09-04T14:56:27Z smartd: libsmartsata: ATA SMART device vid:ATA KINGSTON SKC400SA pid:KINGSTON SKC400SA
2018-09-04T14:56:27Z smartd: libsmartsata: closing fd:5
2018-09-04T14:56:27Z smartd: [warn] t10.ATA_____KINGSTON_SKC400S37512G__________________50026B768200FB54____: above TEMPERATURE threshold (72 > 30)
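
For anyone wanting to pull just the temperature warnings out of a dump like the one above, here is a minimal Python sketch. It assumes the warning lines keep exactly the format shown in this post; the function name `parse_temperature_warnings` and the regex are my own and are not anything shipped with ESXi.

```python
import re

# Matches the "[warn] ... above TEMPERATURE threshold (value > limit)" lines
# in the format seen in the log excerpt above (assumed to be stable).
WARN_RE = re.compile(
    r"^(?P<ts>\S+) smartd: \[warn\] (?P<device>\S+): "
    r"above TEMPERATURE threshold \((?P<value>\d+) > (?P<limit>\d+)\)$"
)


def parse_temperature_warnings(log_text):
    """Return one dict per temperature warning found in log_text."""
    warnings = []
    for line in log_text.splitlines():
        match = WARN_RE.match(line.strip())
        if match:
            warnings.append({
                "timestamp": match.group("ts"),
                "device": match.group("device"),
                "value": int(match.group("value")),
                "limit": int(match.group("limit")),
            })
    return warnings


# Three lines copied from the log excerpt above.
SAMPLE = """\
2018-09-04T14:56:27Z smartd: [warn] t10.ATA_____KINGSTON_SKC400S37512G__________________50026B768200FD6C____: above TEMPERATURE threshold (72 > 30)
2018-09-04T14:56:27Z smartd: libsmartsata: closing fd:5
2018-09-04T14:56:27Z smartd: [warn] t10.ATA_____KINGSTON_SKC400S37512G__________________50026B768200FB54____: above TEMPERATURE threshold (72 > 30)
"""

if __name__ == "__main__":
    for w in parse_temperature_warnings(SAMPLE):
        print(w["timestamp"], w["device"], w["value"], w["limit"])
```

On the sample it picks up two warnings, one per Kingston drive, both reporting 72 against a threshold of 30.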

 
Some SSDs throttle at 70°C so you'll see poor transfer speeds when copying files. I had an M.2 SSD do this when testing it, hanging in open air.

I also learned that NAND memory works best when warm, so if you apply any heatsinks to the SSD(s), apply them only to the controller chip and not the flash.
 
I really doubt those SSDs, which are lightly used, will get to 70 degrees. We'll first check the S.M.A.R.T. data in the BIOS and, if it says the same, we will use a finger to check. If they are indeed hot we will add a fan, as the ambient temperature is already controlled in our server room.
 
I, too, can confirm that my X399 ASRock Taichi is working perfectly with ESXi 6.7!!! I was on BIOS 2.30 and ESXi was stuck booting, but after flashing to BIOS 3.30 and connecting my original hard drives untouched... it worked! I had an Intel motherboard/CPU die on me this past weekend in my ESXi system and decided to cannibalize my TR system to get it up and running ASAP. This was my original plan when I got my TR hardware. But I have my FreeNAS, Plex, and OpenVPN working again! Now, what do I do going forward? I have lots of options. I think I might test GPU passthrough and see if I can game on this. I've been watching this thread for months.
 
Can also confirm 6.7 works on my Taichi :) Big thumbs up to the lads at VMware / ASRock for finally getting this sorted! I haven't had much in the way of ESXi work recently, but when my next bit comes along it's gonna be really nice not having to swap boards in and out.
 
This may be the wrong thread: So, now that we are happy with X399 boards and all with ESXi 6.7, what do we do about hardware health? What can I do to monitor CPU temp? Motherboard temp? Hard drive health? SSD/M.2 health? Do we go old-school and put in temp sensors and LCD screens? Or use HBA cards, pass them through to a VM and then monitor them there? Maybe periodically shut down ESXi and run SMART utilities on the hard drives via a USB thumb drive? I know enterprise motherboards have built-in monitoring hardware that ESXi can read. But we homelab users who are using this to teach ourselves need to maintain these systems for our families.
 
If ESXi is working properly then all of those things should already be monitored; certainly on HP hardware you can see all that. If you load the extra HP packages you can also see storage controller status, including RAID and batteries etc.
 
This is true, but the monitoring is not fully supported on pretty much anything that isn't OEM hardware. I have never had an unsupported board that properly reports on the hardware in ESXi. For an unsupported lab, "fully working" means everything apart from monitoring, imo. So long as SATA and all the other hardware is available to the environment, that should be enough for a lab. In production I'd really be sticking with HP DL385 Gen10 etc., and enough of them for HA and DRS; that's just sensible if you are running anything that would be classed as medium or above in terms of business size. I was going to buy the Gen10s this year, but with 7nm EPYC so close I'm now waiting on that. You can bet all the money that when HP release those new 385s I'll be all over them :)
 
Hi!
Thanks guys for your posts.

Did you test functionality like Fault Tolerance on this AMD CPU?
Did you test nested virtualization (ESXi 6.7 in ESXi and ESXi 6.7 in VMware Workstation)?

I'm also going to build a home lab (nested virtualization) for vSphere.
I want to install Win10 (as the bare-metal OS), then VMware Workstation (14 or 15), and in there I want to have a few ESXi 6.7 hosts to simulate two sites, so something like 6 x nested ESXi, 1 x vCenter + 1 vCenter HA {3 x vCSA}, 1 VM as AD (domain controller), 1 VM as storage (iSCSI + NFS), 1 VM as backup system (maybe Veeam), 1 VM as monitoring (maybe Veeam ONE or Splunk or Nagios/Zabbix or ???? - I want to test it all ;-) )
Anyway - I will need at least 128 GB of RAM.

I was thinking about the Intel Core i7-7820X but it is too expensive right now, so I started thinking about AMD Threadripper.
Can you give me any suggestions?

Thanks in advance.
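
To sanity-check the 128 GB figure, a rough RAM budget for the lab plan above can be sketched in Python. The VM counts come from the post; every per-VM memory size and the host headroom are placeholder guesses of mine, so swap in your own numbers.

```python
# Rough RAM budget for the nested lab described above.
# Counts follow the post; the GB-per-VM values are ASSUMED placeholders,
# not figures from the thread or from any vendor sizing guide.
PLAN = [
    # (role, count, assumed GB each)
    ("nested ESXi host", 6, 8),
    ("vCenter (standalone vCSA)", 1, 10),
    ("vCenter HA node (vCSA)", 3, 10),
    ("AD domain controller", 1, 4),
    ("storage VM (iSCSI + NFS)", 1, 8),
    ("backup VM", 1, 8),
    ("monitoring VM", 1, 8),
]

# Assumed headroom for Windows 10 and VMware Workstation on the bare metal.
HOST_OS_GB = 8


def total_ram_gb(plan, host_overhead_gb=0):
    """Sum count * GB for every role, plus optional host overhead."""
    return sum(count * gb for _, count, gb in plan) + host_overhead_gb


if __name__ == "__main__":
    for role, count, gb in PLAN:
        print(f"{count} x {role}: {count * gb} GB")
    print(f"Guests only: {total_ram_gb(PLAN)} GB")
    print(f"With host headroom: {total_ram_gb(PLAN, HOST_OS_GB)} GB")
```

With these guesses the total comes out to 124 GB, so 128 GB would leave very little slack if everything runs at once.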
 