Home Lab Threadripper Build Thread.

Man of Honour · Joined: 30 Oct 2003 · Posts: 13,635 · Location: Essex
I was watching the release of Threadripper carefully, and the core count got me thinking about how I could use it to replicate parts of my work's virtual infrastructure, in order to plan out projects more effectively and simulate infrastructure and software upgrades outside of the building, which frankly isn't really possible on the ageing 1090T system I have at home now. To give some perspective, I guess we should start with what I do in my day job, which is basically to head up all technical projects for an Intellectual Property Legal 500 firm. I have been in this job for 5 years now, which is a long time, and in that time we have pretty much had a top-to-bottom rebuild and our infrastructure is better than ever. I'll talk a little bit about what I manage before we get to the build.

Without going into too much detail, split between 2 sites we have 6 DL380 G8s, 5 of which are ESXi hosts, each rocking 192GB of memory and 2x 6-core HT Xeons, a huge (for the size of the company) HP 6300 EVA with 5 trays of 600GB SAS disks, and two HP StoreOnce 2400s. This is all connected up with fibre and powered by a 30k HP UPS installation that weighs in at 2 metric tonnes. I won't touch too much on switching or security, but our core switches are layer 3 PoE Cisco units and we run mainly FortiGates on the perimeter.

It's pretty clear that X399 isn't going to get anywhere near the performance metrics I am getting out of the infrastructure above, but the core count coupled with the I/O should offer me the ability to effectively replicate subsets of it, while also giving some insight into projected EPYC performance in, for example, SQL, IDOL, SharePoint, iManage and other similar corporate workloads.

This build from top to bottom will disappoint some but hopefully interest others, and I might even be able to answer some questions along the way. I'm not totally against giving some gaming benchmarks and a little overclocking, but that certainly will not be the focus of this thread, so let's not steer it that way too much please, gents. To go alongside what I am doing with this kit, I have convinced a buddy of mine (didn't take much convincing tbh) to come over and experiment with some of his creative workflows, which he tells me involve the following: Lightroom 6, LRTimelapse v5.7, After Effects CC 2015, Premiere Pro CC 2015, GoPro CineForm decoders, Avid codecs etc. I will also share his thoughts, or perhaps even get him to chime in.

It's about time I stopped waffling on, so let's get started. The kit list I have put together is as follows. It's not all here yet, but I received the first bit today, which is the PSU, and have ordered the rest, so I thought I would get started putting something together to document it on here.


The Machine:

Gigabyte AMD X399 AORUS Gaming 7 Motherboard Edit: ASRock X399 Taichi
Edit: Asus Prime

AMD Ryzen Threadripper 1950X 16 Core TR4 Processor

2x G.Skill Trident Z RGB 32GB Kit DDR4 3466MHz RAM

Corsair Hydro Series H100i v2 Extreme Performance Liquid CPU Cooler

Corsair Crystal Series 460X RGB Compact ATX Mid-Tower Case (hope this is going to be big enough)

Seasonic 860W Platinum Power Supply

A solitary XFX RX 480 8GB

256GB Samsung Pro NVMe

4x 1TB, 1x 8TB WD Red, 3 or 4 random SSDs, probably in RAID 0

Netgear ReadyNAS serving as extra LUN storage (perhaps snapshots, haven't decided yet)

A USB stick for ESXi.

Fortinet FortiGate 30E (because I am poor and can't afford the top-end kit)

CyberPower Rack Mount LCD Series PR3000 UPS.


I guess that’s really all I have for now. I will update as I start building and as I receive more kit which looks to be in the next few days.
 
In my experience, you'd be better off buying a matched 64GB kit than using 2x 32GB kits. Good spec though :)

I haven't been burned by this yet and my last two sets have been non-matched. Out of interest, what's the difference? Years back I bought a matched XMS3 kit which just came in 2 packs in much the same way.
 
The difference is that you're mixing kits, versus not mixing them. Mixing memory will have varied results depending on the platform, speed and density of the memory kits. Memory is binned at the timings and speed by the vendors at the density it is sold in. In combining kits you are eating into the guardband put in place by the memory vendor. Moreover, you're putting more strain on the IMC. 64GB @ 3466 on Threadripper is not likely to be plug and play, especially with combined kits. I'd fully expect to have to tune things manually, and even then I would probably expect to make some concessions.

Fair play, seems reasonable. I have to be honest, I don't expect plug and play, and to be fair I originally ordered a single 32GB kit, but when I checked the order online they had added two. Rather than argue I just went with it, because well, more is better, right? I also figured that rather than just buying 3200, if I go a bit faster I might have a better chance of hitting 3200 at decent latency, not necessarily hitting the advertised speed. I can assess the situation and then buy more if need be, I guess.
 
Sounds good, as long as there's realistic expectations going in that's all that matters!

Absolutely and that is exactly what I tell the wife :D.

My order still hasn't shipped :mad:. The board is assigned to my order but it appears it's not left yet; hopefully it gets shipped tomorrow and we can get this show on the road. I'll throw a couple of photos of the PSU, firewall & UPS up later tonight or tomorrow. With the firewall, other than routing, IDS, antivirus and all the other good stuff a FortiGate can do, I think I might have a play setting up an IPsec tunnel back to the office passing H.323, so that I can set up an Avaya IP phone or Avaya SIP DECT at home. That has to be a better solution than forwarding calls to my mobile from the office.
 
Do Fortinet not do a VM appliance you could have used to keep the cost down?

They do, and it is something I was going to look closer into. The 30E for me was about the baseline for what I needed, and because I am going to switch between Windows and ESXi, I thought that the always-up nature of a hardware firewall would probably suit my needs better than the virtual appliance. I paid about £330 for the 30E and that's about as far as I could stretch. The plus side as I see it is that the 30E runs pretty much an identical OS to the more expensive enterprise-grade kit.

Likewise... I understand Ryzen now works under ESXi (after a patch), but on Linux KVM Ryzen has nested page table problems and passthrough is not reliable/consistent, so I'm very interested to hear how TR works under ESXi.

Also interested in how that mobo splits devices into IOMMU groups, both with and without an ACS patch... so if you can boot directly into a Linux kernel and do some dumps (rather than into ESXi and then Linux) I'd be grateful...
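For reference, the grouping info lives in sysfs, so a dump doesn't need any special tooling. A minimal sketch, assuming the standard `/sys/kernel/iommu_groups` layout on a bare-metal Linux boot with the IOMMU enabled in BIOS and kernel:

```python
import os

def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [pci_addresses]} read from sysfs.

    Each IOMMU group is a numbered directory whose 'devices'
    subdirectory contains one entry per PCI address in the group.
    Returns an empty dict if the IOMMU is disabled or unsupported.
    """
    groups = {}
    if not os.path.isdir(root):
        return groups
    for group in sorted(os.listdir(root), key=int):
        dev_dir = os.path.join(root, group, "devices")
        groups[group] = sorted(os.listdir(dev_dir))
    return groups

if __name__ == "__main__":
    found = iommu_groups()
    if not found:
        print("No IOMMU groups - is the IOMMU enabled in BIOS/kernel?")
    for group, devices in found.items():
        print(f"IOMMU group {group}: {' '.join(devices)}")
```

Run as-is it just prints each group and its PCI addresses; cross-referencing against `lspci` output gives the human-readable device names.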

I am not going to lie some of this is straight over my head but I am more than happy to spend a little time learning myself and trying to provide you with some insight into this. As I start getting into the software side of the build I will make a list of the stuff people would like to know and hopefully with your help can answer some of the more complex questions :)

I'll get this out the way now....

You do not need more than 8GB of RAM.... :p

Only 8GB! That might just be enough for one under-provisioned SQL server :) Seriously, I was actually thinking that the 64GB might be just a little short. Have you seen the price of memory these days? It's obscene.
 
You plan on chucking all the HDDs in the same box?
snip...
I personally like the all-in-one-box approach, small footprint, multi-purpose etc, but did you consider a scaling-out rather than scaling-up approach?

The plan is one box, with storage on the LAN to back it up. The fact that the case has little room for hard disks was a schoolboy error on my part, but also largely down to my limited budget, which went on the bits that really count; some getting creative and perhaps 3D printing of some case internals may be required. If need be I'll buy a new case later down the line when I have more cash available. In terms of scaling the company's infrastructure, because of the nature of some of the data we keep we cannot use cloud services; we do however have a co-lo of our own kit in a secure data centre. Put simply, we need to design secure, robust infrastructure that scales well with the company's requirements.

How do you currently replicate data (assuming that's what you do?) between your two sites, and do you plan on emulating that in your lab?

We currently use dedicated HP StoreOnce 2400 appliances; they are basically DL380s running custom software with dedicated de-dupe, compression and cross-site replication, and used alongside Veeam they are very neat devices. I don't think there are any issues with sharing links as OCUK don't operate in this space, so take a look here: https://www.hpe.com/uk/en/storage/storeonce.html - any issues, dons, then please feel free to remove. This isn't really something I plan on replicating in the lab, to be honest, firstly because the hardware is on the expensive side and secondly because I don't really need to replicate the whole environment to achieve the majority of my short-term goals with this project. One of the first things on my list, bar getting ESXi up and running, would probably be an in-place upgrade of legacy systems that are in desperate need of a revamp. We only have one of these, which is our QlikView server and dashboard; it is a dedicated box and not virtualised for some unknown reason, so that will have to be done straight away. Then I plan on simulating an iManage 9.3 to 10 + Web upgrade and seeing what parts of our custom code fall to pieces.
 
I bought a new monitor for the build last night, nothing special as I'm running low on available funds, but it is a 4K 28" FreeSync jobbie which I thought represented decent-ish value for money. It should be better than my current 3x 1080p Samsung screens, and I might just add another 2 over time. I've also got my eyes open for another RX 480, but what with the mining stuff going on they do not represent value for money right now, so I may just hold on until a second card pops up at a reasonable price. The other option is Vega 56 or something from the green team, but for now I think I am happy enough with what I have, even if the RX 480 doesn't really stand a chance at these resolutions.
 
Given that updates are slow because of the motherboard stock situation I thought I would put a couple of teasers up of what is here so far :)

Right, we will have plenty of updates later today lads; we are built, looking awesome and completely stable. Currently running just Windows on NVMe. TR base clock set at 3.7GHz (two cores still seem to boost to 4+), RAM currently at 3066. Pics and some details to come today. FWIW Threadripper is an absolute monster!
 
Native Windows atm. I'm building parts at a time, so Windows first and getting it dialled in. At current settings it does the Ryzen Blender job in just over 11 seconds.

This is a 3.7 base with 4.05 boost on 2 cores, voltage somewhere around 1.4V, and temps are still really nice, maxing out at around 65°C under load. Memory is at 3066 atm, but this was with only about an hour's worth of tweaking.
 
Right time to update with a few pics of the build, right now I am only running with the NVME drive:

I've hooked up an 8TB Seagate ST8000AS0002 and transferred a couple of old 1TB mechanicals out of my old machine, and in the process broke an SSD that made up part of my old RAID 0 array.

The current drive config, which I will change again once I've transferred some stuff around, is:

Samsung Pro 256 NVMe
1x 8TB Seagate ST8000AS0002
2x Samsung Spinpoint F1 1TB
1x Kingston V300 120 SSD

4x 1TB in the Netgear ReadyNAS p400

The final config will be roughly the same, removing the V300 120 and adding 3x Hypertech Firestorm 240s plus a further mixed Kingston V300 480 & Crucial M500 480 RAID 0 array. I'm going to have to design and 3D print a little cradle that can sit in the bottom of the case to house the extra SSDs, but that should give me roughly 14TB of accessible drive space to play with, albeit with varying performance, so I'll have to work out what will work best on what storage in terms of exposing it to ESXi.
 
Nice looking rig mate... it's nice to see Threadripper builds so quickly.

I'm not so sure about quickly; I'm still trying to dial in the memory and CPU. I would love to see 4GHz on all cores and memory up at 3200+ with tight timings, but I am not sure how achievable that is with what I have. There are so many options for clocking and tuning that I'm struggling with what does what. The last time I really overclocked anything was about 6 or 7 years ago when I was setting up my old rig, so it's a bit of a learning curve to be honest. I might have a look over some of the Ryzen threads and see what people are doing with the 1700 chips. I sat there randomly poking it and staring at the BIOS screen until about 2am last night, so I have decided to just stick with a nice stable middle ground for now.

Tomorrow I think I'll start seeing how it likes ESXi :)
 
I properly broke it last night while tweaking (read: poking about in the BIOS without a clue). Enough that I nuked Windows good and proper, so today I'm back on Windows duty.

Edit: Windows all built again and the majority of software installed. Decided to clone Windows onto that spare V300 SSD just in case I manage to break it all again.
 
Vince, gave up waiting on the Aorus X399 then?

I did mate, got bored of waiting for the Aorus, and to be fair I would have preferred the Taichi from the beginning.

Back to the build. Perhaps unsurprisingly I'm having some ESXi issues; I got around my first one, as fast boot was causing problems with loading, but now I have an all-new issue. When loading ESXi the system seems to hang here:



I've tried disabling all the SATA ports on the board and as many other built-in devices as I can, but as yet no luck. I gave up late into the night and decided this is one to troubleshoot this evening, if anybody has any advice or ideas. For now I have Workstation 12 Pro, which isn't really ideal but will do while I work out what's causing the issue.
 
This might not be of any use, but I had a massive number of problems with the native AHCI drivers in ESXi 6.5.0 on an HP MicroServer. Apparently these are fixed in 6.5 Update 1: http://www.virtuallyghetto.com/2017...ance-issue-resolved-in-esxi-6-5-update-1.html. Of course, it could be a different issue as the X399 boards are so new.

Cheers for this. I checked and I am running 6.5u1.

The file name/build details are as follows: VMware-VMvisor-Installer-6.5.0.update01-5969303.x86_64.iso
Sadly this is the same build number as above. On top of this I tried the "Driver Rollup ISO": VMware-ESXi-6.5U1-RollupISO.iso - again, this fella hangs at the same point. I'm wondering whether it's getting past the AHCI stage or not. Not sure quite how to troubleshoot this one, but the steps tonight are going to be:

- Disable USB3
- Disable what network adapters I can
- Build my own iso with drivers injected.

Basically all I can think of is poking it till it works. It's clear there is something it doesn't like; now I need to find out what. The biggest problem I have is the lack of relevant information out there. Googling Threadripper and ESXi, as you can imagine, doesn't return many results.
 
When people were trying Ryzen R7 with ESXi it was also failing; I'm not sure of the exact error, but disabling SMT resolved the issue and allowed it to boot.

Obviously disabling SMT is not ideal but worth checking.

On the AMD slides during the build-up and release, did they have VMware listed as a partner? Maybe the official patches will be released with EPYC.

The SMT issue with Ryzen 7 was related to an apparent 15-core hard limit in ESXi, which I think was patched about a month ago. Disabling SMT kept the core count down, allowing things to boot.

I did try disabling SMT last night with little luck. I do however wonder what happens if I disable a CCX, essentially turning it into an 1800X, so I'll give that a try as well. Life would be much easier if it threw an error, but instead it just sits there. I do wonder if it could be something altogether simpler, like ESXi trying to pass VGA, in turn crashing what I can see but perhaps still working behind the scenes. I should be able to verify whether it's actually crashing by seeing if I can ping it or beat it into submission in the console this afternoon.
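On the "is it actually dead or just lost video output" question: a plain ICMP ping works, but from a script a TCP probe against one of the ports ESXi normally listens on (443 for the host client, 22 if SSH is enabled, 902 for the host agent) is easier to automate without root. A rough sketch, not tied to anything specific in this build; the host address is whatever the installer was given:

```python
import socket

def port_open(host, port, timeout_s=2.0):
    """Return True if a TCP connection to host:port succeeds.

    A successful connect means the host's network stack is up and
    something is listening, even if the local console looks hung.
    Any connect failure (refused, unreachable, timeout) returns False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    host = "192.168.1.10"  # hypothetical ESXi host address
    for port in (443, 902, 22):
        state = "open" if port_open(host, port) else "closed/filtered"
        print(f"{host}:{port} {state}")
```

If 443 answers but the screen is frozen, the install has probably carried on behind the scenes and only the local console output has died.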
 