Dead Epyc Replacement Conundrum

Hi all,

Going round and round in circles in my head at the moment on what to replace my dead SuperMicro H11SSL-i with (paired with an Epyc 7402P which may have also been killed - it just turned off whilst I was remoted in and now refuses to show any signs of life (multiple tests done with an Epyc 7551P, different RAM and 3 different PSUs with the same results)).

As it stands I use(d) the Epyc system as a bit of a playground for both learning on and testing different VM configurations - 1 CPU, 2 gamers, Windows Server with domain logons (getting a better understanding of our work systems), direct PCI-E passthrough to VMs, that sort of thing. It was also paired with two GPUs plus multiple NVME and SATA SSDs, which is the main point where I am scratching my head over what to do.

I know I can buy a replacement Epyc board (plus maybe the CPU, can't confirm unless I bought another board and tested it) or replace it with a used Threadripper TRX40 system (I want at least Zen 2 performance based on my 7402P vs 7551P experience), but I am wondering if a specific "mainstream" option may suffice with modifications to my existing setup.

Currently I run:

SuperMicro H11SSL-i MB
Epyc 7402P (24/48 - basically a lower clocked 3960X)
128GB DDR4 2666MT/s ECC
RTX 4070 Primary GPU (Game VM 1)
GTX Titan (Proxy) Secondary GPU (Game VM 2)
4 x 1TB NVME on Asus M2 Expander card (requires full X16 slot with x4x4x4x4 Bifurcation)
2 x 512GB NVME on 8x Dual M2 Expander card (requires x8 slot with x4x4 Bifurcation) - Game VM Boot drives
2 x 512GB Sata SSD (Proxmox Boot)
2 x 1TB Sata SSD (Other VM Boot such as Windows Server)


I am thinking that I can take advantage of X670E and its 4 x NVME slots (I think you can do that with B650E as well but not sure if there are more limitations) by dropping the Expander card and tweaking my setup to remove the additional 2 x 512GB NVME drives (might be able to mount these as external drives via USB albeit at a performance penalty). This would be combined with a 16c/32t CPU to give me enough (just) cores to play with.

However, I would still like to run two GPU's where possible.

Are there any boards which support dual X16 PCI-e slots (running @ x8/x8 which is fine for my purpose) whilst also allowing for full function (if it drops to x2 that is fine) on all 4 NVME slots and having at least 4 x SATA ports available?
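
To put rough numbers on that question, here is a minimal Python sketch of the lane arithmetic. The device widths come from the post (x8 per GPU, x4 per NVME with x2 as the acceptable fallback); the 24 usable CPU lanes figure is an assumption for a typical AM5 CPU and should be checked against the board manual. It only adds up lanes, it does not model how a given board actually wires or shares its slots.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    wanted: int   # lanes at the preferred width
    minimum: int  # narrowest width described as acceptable in the post


# Device list taken from the post: two GPUs at x8 and four NVME drives.
DEVICES = [
    Device("GPU 1", 8, 8),
    Device("GPU 2", 8, 8),
    Device("NVME 1", 4, 2),
    Device("NVME 2", 4, 2),
    Device("NVME 3", 4, 2),
    Device("NVME 4", 4, 2),
]

# ASSUMPTION: usable CPU lanes on the target platform (not from the thread).
ASSUMED_CPU_LANES = 24


def lane_demand(devices):
    """Return (lanes at preferred width, lanes at minimum width)."""
    return sum(d.wanted for d in devices), sum(d.minimum for d in devices)


def shortfall(devices, cpu_lanes):
    """Lanes that cannot come from the CPU and must hang off the chipset."""
    wanted, minimum = lane_demand(devices)
    return max(0, wanted - cpu_lanes), max(0, minimum - cpu_lanes)


if __name__ == "__main__":
    wanted, minimum = lane_demand(DEVICES)
    over_wanted, over_min = shortfall(DEVICES, ASSUMED_CPU_LANES)
    print(f"Preferred widths need {wanted} lanes, minimum widths need {minimum}")
    print(f"Beyond the CPU: {over_wanted} lanes (preferred), {over_min} (minimum)")
```

Under that assumption the preferred widths overshoot the CPU by 8 lanes, which is why at least some of the NVME slots on these boards end up behind the chipset.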

From looking I can see something like the MSI X670E Carbon has, on paper at least, what I am looking for but I can't see in the specifications if there are any limitations when running dual GPU (@ x8) and all 4 NVME at the same time. Normally I would expect to see if X then Y is not possible in the manual but I couldn't see anything after a quick skim. The Strix E-E also looks like it has similar features (specifically x16 slot spacing) whilst being in a similar ball park price wise. Other boards like the Crosshair are too expensive (may as well go WRX80 at this point) or don't have the correct slot spacing (Aorus Master).

Cost is also a consideration as I can get a 3960X / TRX40 bundle for around £700 on Ebay so need to factor in the total cost of anything new (as I would need DDR5 as well for AM5 vs just using the existing ECC Ram I have for Threadripper). I would say overall budget is <£1,000 if new and needing new RAM as well. Anything less is obviously a bonus.

I would be looking to pair this with a 7950X (or upcoming AM5 chips) thus would see a significant improvement to single core performance but also need to factor in total core availability. I am primarily looking at AMD as the cores are all equal and IIRC they have more PCI-E lanes available (or at least the setup on the boards is more conducive to my requirements).

Networking wise having 10Gb would be a nice bonus but 2.5Gb will be fine for our use case.

Effective end goal would be to have a system I can still tinker on but would be a primary gaming platform for 2 x users (VM's) and a storage server (with my existing NAS acting as a backup). Everything ideally needs to fit into a standard ATX form factor case as I don't have the room for a blade style server and nor do I want to have to deal with the noise.


Or, am I being daft and really need to look at HEDT as anything "mainstream" will cause me too many headaches with resource allocation etc??


Sorry for the long post but I am really struggling getting my head around what I should do. :)
 
I am thinking that I can take advantage of X670E and its 4 x NVME slots (I think you can do that with B650E as well but not sure if there are more limitations) by dropping the Expander card and tweaking my setup to remove the additional 2 x 512GB NVME drives (might be able to mount these as external drives via USB albeit at a performance penalty). This would be combined with a 16c/32t CPU to give me enough (just) cores to play with.

However, I would still like to run two GPU's where possible.

Are there any boards which support dual X16 PCI-e slots (running @ x8/x8 which is fine for my purpose) whilst also allowing for full function (if it drops to x2 that is fine) on all 4 NVME slots and having at least 4 x SATA ports available?
I think you'll find that B650E has to drop the CPU down to run 4x M.2 slots, but you should be able to run 4x M.2, 4x SATA and 8 lane/8 lane on the graphics with X670E.

Networking wise having 10Gb would be a nice bonus but 2.5Gb will be fine for our use case.
I'd look at the X670E Creator.

I would be looking to pair this with a 7950X (or upcoming AM5 chips) thus would see a significant improvement to single core performance but also need to factor in total core availability. I am primarily looking at AMD as the cores are all equal and IIRC they have more PCI-E lanes available (or at least the setup on the boards is more conducive to my requirements).
They have 2x CPU connected M.2 slots, which can be PCI-E 5.0, whereas Z790 only has 1x CPU connected M.2 slot and that is PCI-E 4.0.

Or, am I being daft and really need to look at HEDT as anything "mainstream" will cause me too many headaches with resource allocation etc??
One thing to be aware of is that running 4 sticks of RAM (especially dual rank RAM, which 32GB would be) is going to be hard on your CPU's memory controller and that will drop the speed down by quite a lot. We don't know if the newer CPUs (9000) are going to be better at handling that, but AMD's official spec is still just 3600 for 4 single or dual rank sticks.
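
As a rough feel for what that speed drop means, here is a small Python sketch of theoretical peak memory bandwidth (transfers per second x 8 bytes per 64-bit channel x number of channels). The 3600 figure is the one quoted above; the 5200 comparison point is an assumption for a two-stick setup and is not from this thread. Real-world throughput will be lower than these peaks.

```python
def peak_bandwidth_gbs(mts, channels=2, bytes_per_transfer=8):
    """Theoretical peak bandwidth in GB/s for a given speed in MT/s."""
    return mts * bytes_per_transfer * channels / 1000


if __name__ == "__main__":
    four_sticks = peak_bandwidth_gbs(3600)  # speed quoted for 4 sticks
    two_sticks = peak_bandwidth_gbs(5200)   # ASSUMED 2-stick speed
    print(f"4 sticks @ 3600: {four_sticks:.1f} GB/s")
    print(f"2 sticks @ 5200: {two_sticks:.1f} GB/s")
    print(f"Drop: {100 * (1 - four_sticks / two_sticks):.0f}%")
```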

I'm also not sure what the availability of ECC DDR5 is like.

I haven't seen much availability on the Epyc 4004 CPUs, but you could look for those too:
 
Thanks for the detailed reply @Tetras.

I have been looking at it a bit more and it does look like B650E would meet my current requirements barring the split on the second PCI-E 16x slot. Looking online it appears the B650E-E Strix board splits the slots as x8/x4 rather than x8/x8 of the X670 equivalent. Given I am not going to be running anything crazy high end this might be sufficient for our needs but I am concerned that having only 4 x SATA may present an issue in the future (say if I want to add a larger HD array for snapshotting or just want more drives in general for redundancy). Nothing set in stone there but something I need to consider in terms of longevity.

The ~£100 saving over the X670E is quite nice though, I will admit!

RAM wise I am thinking I will just go 64GB for now and I am not too bothered if the speed gets reduced as the "gaming" machines will be VMs anyway, thus there will always be an element of overhead eating away at the edges of performance (up until my current board died I had the 4070 paired with effectively 16 Zen 2 threads running at ~3GHz; the bottleneck was there but honestly didn't really cause any issues in practice). If I went with the B650E board I could use the money saved to up the RAM to 96GB but will need to figure out exactly how much I need vs would like. Not fussed by ECC when it comes to this build, it's just what I had with my Epyc system. Nothing on this build is what I would consider mission critical, thus if something went awry due to the lack of ECC I would just reset / re-build etc.

Just had a quick look at the pricing on the Epyc 4xxx series CPU's. Over £700 for the 16 Core vs ~£470 for the 7950X is a killer (ignoring that the 7950X can be had for under £400 used as it currently stands).

Waiting for the 9950X is also an option. Might even bring prices of the 7950X down making it even more attractive.
 
The ASUS B650E-E also has a x4 Gen 4 PCI-E slot (bottom slot), so you could plug a SATA card in. One thing I am not 100% on is whether the fourth M.2 (x4 Gen 5) and the second x16 slot (x4 Gen 5) can both be used at the same time. One review says they can, which would be great, but I don't know for sure.
 
Not 100% on that either, which is why I was leaning more towards X670E if I went this way, to avoid any potential issues. I had also seen a few people online mention that they had trouble getting a second GPU to show up using the B650E-E board. By all rights it should work (albeit at x4 speed) but there is obviously something preventing it, maybe software or maybe hardware related.

I also saw posts with similar issues on the Crosshair Hero, but that may have been a specific issue with the board in question as the platform should support dual GPU (x8 mode).

Doing more digging around PCI-E passthrough, it looks like I have hit a snag with using X670E/Z790 (if Intel) anyway... IOMMU groups when it comes to the NVME slots off of the chipset. You can generally only pass through PCI-E devices to separate VMs if they are in different IOMMU groups. It looks like the ones off the chipset will be in the same group, which would cause further headaches. I could work around it by not using the chipset NVME slots, but I am now thinking that Threadripper is looking more like the way I should go. If nothing else it removes the extra headaches around resource splits / allocation.
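
For anyone wanting to check this on their own board, below is a minimal Python sketch that lists IOMMU groups on a Linux host by reading `/sys/kernel/iommu_groups`. It assumes the IOMMU is enabled in the BIOS and kernel; the function names are just illustrative. It only reports PCI addresses, so pair it with `lspci` to see what each device actually is.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group number: sorted list of PCI addresses} read from sysfs."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups  # IOMMU disabled, or not a Linux host
    for group in base.iterdir():
        devices = group / "devices"
        if group.name.isdigit() and devices.is_dir():
            groups[int(group.name)] = sorted(d.name for d in devices.iterdir())
    return groups


def shared_groups(groups):
    """Keep only the groups that contain more than one device."""
    return {g: devs for g, devs in groups.items() if len(devs) > 1}


if __name__ == "__main__":
    for number, devs in sorted(iommu_groups().items()):
        flag = " (shared)" if len(devs) > 1 else ""
        print(f"group {number}{flag}: {', '.join(devs)}")
```

A shared group is not automatically a problem: a GPU and its own audio function normally sit together and get passed through as a pair. The headache is when two devices you want in different VMs land in the same group.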
 
The ASUS B650E-E also has a x4 Gen 4 PCI-E slot (bottom slot), so you could plug a SATA card in. One thing I am not 100% on is whether the fourth M.2 (x4 Gen 5) and the second x16 slot (x4 Gen 5) can both be used at the same time. One review says they can, which would be great, but I don't know for sure.
Asus stole 8 lanes from the graphics card to provide 4 lanes for the 2nd PCI-E 5.0 M.2 slot, which left only 4 lanes remaining. According to the manual, they have rewired those 4 lanes to the second full length PCI-E slot, but this is unusual because the CPUs are listed as supporting 16 or 8/8 mode, not 8/4/4 (AMD's own slides show 16 or 8/8).

I guess it comes down to who do you trust? If you can't trust the manual, then I don't know what other option we have.

Doing more digging around PCI-E passthrough, it looks like I have hit a snag with using X670E/Z790 (if Intel) anyway... IOMMU groups when it comes to the NVME slots off of the chipset. You can generally only pass through PCI-E devices to separate VMs if they are in different IOMMU groups. It looks like the ones off the chipset will be in the same group, which would cause further headaches. I could work around it by not using the chipset NVME slots, but I am now thinking that Threadripper is looking more like the way I should go. If nothing else it removes the extra headaches around resource splits / allocation.
It is a shame that affordable HEDT doesn't exist anymore, the limitations on lanes and slots on consumer boards are really annoying.
 
I don't see it as ASUS having "stolen" lanes, it's just an optional config that I was aware of before I got it.
The only question for me is whether I can use both. Someday I will try it and find out.
 
Honestly it's the price which is making me look at other options than just going TRX40. Just feels wrong to be paying ~£700+ for a Zen 2 based combo which is bettered in basically every way by a modern mainstream platform other than the PCI-E lane availability.
 
To update the thread I have ended up with a compromise build for now.

Did more research on boards than I care to think about (mainly on the back-end IOMMU groupings) and ended up with:

Asus B550 Pro Art Creator (Has dual 8x PCI-E lanes + decent IOMMU group splitting AND was £160...)
Ryzen 5950X (this one is obvious as it's the only option that works for my use case on AM4 & was £260)
64GB DDR4 3600C16 - Vengeance LPX (annoyed by this one as I was hoping the ECC RAM I had to hand would work; it very much did not work, and changing a setting in the BIOS caused it to no longer recognise my test RAM. Luckily the board has BIOS flashback...)

Current plan is dual 6 core gaming VM's with direct PCI-e passthrough with one running off a NVME and the other a SATA drive (biggest limitation of non AM5/TR is the lack of CPU connected NVME*). I will then have the other NVME and SATA ports acting as storage pools for the VM's and/or another small VM acting as a test Storage server.
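
A minimal Python sketch of how that core split could be laid out for CPU pinning, keeping each gaming VM on its own CCD and leaving two cores per CCD for the host. The 5950X's layout of two 8-core CCDs is known; the thread numbering (SMT sibling of core N is logical CPU N + 16) is an assumption about how Linux enumerates the chip, so check it with `lscpu -e` before using the numbers. The function and key names are hypothetical.

```python
def pin_plan(cores_per_ccd=8, ccds=2, vm_cores=6):
    """Return logical CPU lists: one VM per CCD, leftover cores for the host.

    ASSUMPTION: the SMT sibling of physical core N is logical CPU N + total
    cores. Verify on the real host before pinning.
    """
    if vm_cores > cores_per_ccd:
        raise ValueError("a VM cannot use more cores than one CCD has")
    total = cores_per_ccd * ccds
    plan = {}
    host = []
    for ccd in range(ccds):
        first = ccd * cores_per_ccd
        vm = list(range(first, first + vm_cores))
        spare = list(range(first + vm_cores, first + cores_per_ccd))
        # Each physical core contributes itself plus its SMT sibling.
        plan[f"vm{ccd + 1}"] = sorted(vm + [c + total for c in vm])
        host += spare + [c + total for c in spare]
    plan["host"] = sorted(host)
    return plan


if __name__ == "__main__":
    for name, cpus in pin_plan().items():
        print(f"{name}: {','.join(str(c) for c in cpus)}")
```

Keeping a VM inside one CCD avoids its threads hopping across the two L3 caches, which is the main reason for splitting it this way rather than just taking the first 12 cores.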

*I could always run my dual NVME card in the second PCI-E x16 slot (@ x8) with Bifurcation and run Paravirtualization on the GPU to split it between two VM's...

Separately I will either look to replace my Epyc board or go with a first Gen TR system (X399) to fully replace my Epyc system. With the 5950X system I am less bothered by raw CPU performance thus can afford to go with a slower setup. Plan here is full DC/AD server with storage server and Compute VM's (testing only).

Lots to plan and tinker with so I am happy. :)

For anyone wondering, the same solution on AM5 was going to run around £1,000 for a 7950X, a MB with the features I wanted and 64GB DDR5. Yes, the AM4 system I now have is slower and yes, it has fewer options to play with (although in practical terms it's only the additional 1 x CPU linked NVME that it is missing - the extra 2 NVME on AM5 are from the chipset, thus in the same IOMMU group and either not possible to pass through or a PITA to deal with), but at half the price it was a hard sell going for AM5.
 
(annoyed by this one as I was hoping the ECC RAM I had to hand would work, it very much did not work and changing a setting in the bios caused it to no longer recognise my test RAM, luckily the board has bios flashback...)
You're aware that your Epyc uses registered ECC RAM and consumer Ryzen only supports unbuffered ECC, right?
 
Ah that would explain it, I didn't realise there were two variants of ECC RAM, the manual for the board just states ECC and Non ECC Unbuffered memory but I had missed the comma after Non ECC.

Not sure how I missed that tbh as I had the memory directly in front of me... :o

No harm, no foul as I now have 64GB of much faster DDR4 anyway.... by accident of course...:D

(EDIT: Looks like TR is the same in only supporting UDIMM and not RDIMM, at least that makes my board replacement easier..)
 