Machine build with local AI

I want to set up a local AI assistant to help me with coding and design. What are the best, and more importantly no-cost, options available out there? I don't want the limitations of using cloud AI providers.
 
RAM, RAM and more RAM. That allows the largest models.

CPU/GPU plays a part too in response time and output speed… but getting into that can start costing.
 
I'm working on a new build at the moment, and initially it was going to be micro-ATX with two RAM slots, so I bought 128GB.

I've put most of the bits together, but then found an ATX case that would fit where I want it, so I'm trying to decide if I should ditch the micro and go full ATX so I can have 256GB of RAM to last me a few years and be able to dabble with local AI.

Not sure it's worth it though? Would the bump be worth it, you reckon?
 
You don’t need 128GB, let alone 256GB, to dabble with local AI. Even the 7B models are fine for messing with, and any CPU/RAM setup is going to be glacially slow compared to something that’ll fit on a GPU.
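For a sense of scale (my numbers, not benchmarks): a model's weight footprint is roughly parameter count × bits per weight, which is why quantised 7B models fit comfortably in ordinary amounts of RAM or VRAM:

```python
# Rough memory needed just to hold a model's weights.
# Real usage is higher: KV cache, activations and runtime overhead add on top.
def weights_gb(params_billion: float, bits_per_weight: int) -> float:
    """Decimal GB for the weights alone."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for bits in (16, 8, 4):
    print(f"7B model @ {bits}-bit: ~{weights_gb(7, bits):.1f} GB")
# 16-bit: ~14.0 GB, 8-bit: ~7.0 GB, 4-bit: ~3.5 GB
```

So a 4-bit 7B model squeezes onto a mid-range GPU, while a 120B model at 4-bit still wants ~60GB, which is where the big unified-memory boxes come in.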

Once you’re spending that sort of money you might as well take advantage of the providers’ loss-leading on frontier models and let them do the legwork.
 
Well, I wanted 128GB anyway as I frequently max out my 32GB in Adobe products.

I tested Topaz a while back and it shifts to using your normal RAM when the video card caps out, so I figure the more of that you have the better.

I've seen videos on YouTube of people with Macs and 512GB of RAM, so... there are ways to fill it up if you try! It'll probably be a while before I bother to get a 5-series graphics card though.

I already spend a lot on the frontier models every month through work.
 
You end up with single-digit tokens per second when running large models on CPU. The Mac Studio/MacBook Air aren’t bad options, as the unified memory means the GPU has direct access to system memory… although loading a Mac Studio up costs megabucks (£10k+).
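The single-digit figure follows from a back-of-envelope roofline: generating each token streams essentially the whole weight set through memory once, so throughput is capped at roughly bandwidth ÷ model size. A sketch with illustrative bandwidth numbers (my assumptions, check your own hardware):

```python
# Upper bound on generation speed for a memory-bandwidth-bound LLM:
# every generated token reads (roughly) all of the weights once,
# so tokens/sec <= memory bandwidth / model size.
def max_tok_per_sec(bandwidth_gb_s: float, model_gb: float) -> float:
    return bandwidth_gb_s / model_gb

# Illustrative bandwidths (assumptions, not measurements):
systems = {"dual-channel DDR5 desktop": 90, "high-end unified-memory Mac": 800}
for name, bw in systems.items():
    print(f"{name}: <= {max_tok_per_sec(bw, 40):.1f} tok/s on a 40 GB model")
```

That's why a desktop CPU crawls on a 40GB model while the wide unified-memory machines (and discrete GPUs with fast VRAM) stay usable.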

The NVIDIA DGX Spark started shipping last week. Probably the best option if you're serious about training models.

Stuffing as many 4090/5090s into a PC as you can is still king for inference.
 
Yeah, I saw a vid on the Spark the other day: something like £3k-£4k!

I'll end up getting a beefy GPU at some point. It's just a shame I don't game anymore, so unfortunately I wouldn't really utilise it for that.
 
Bit late to this thread, but FWIW any of the Strix Halo 128GB builds are pretty decent for this sort of stuff. Corsair have the top-of-the-range (Max+ 395, 128GB) box for £2k or so.

Even something simple like Lemonade Server on Windows works fine via Vulkan, although you can get more out of ROCm on Windows with the RDNA kit now.

Runs something like gpt-oss-120b-mxfp-GGUF OK, although you'd be better off running that on Linux until everyone catches up with the fact that yes, RDNA GPUs with >64GB memory ARE now addressable above 64GB.

It's Windows and frankly I'm amazed it works as well as it does, but it's using the 8060S GPU (CPU usage at 5%) with Vulkan and churning out decent results on a 120W power draw. It's a monster that won't cost you a fortune to run.

Also, 128GB of quad-channel 8000MT/s DDR5 on the APU doesn't look as expensive as it did at the start of the year ;)
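If it helps anyone trying this: Lemonade, like most local servers (llama.cpp's server, Ollama, LM Studio), exposes an OpenAI-compatible chat endpoint, so any standard client or plain HTTP works against it. A minimal request body looks like the below; the model name, host, port and path are assumptions on my part, so use whatever your server actually reports:

```python
import json

# OpenAI-compatible chat completion payload; POST it to your local server,
# e.g. http://localhost:8000/api/v1/chat/completions (path/port vary by server).
payload = {
    "model": "gpt-oss-120b",  # hypothetical id - use the one your server lists
    "messages": [
        {"role": "user", "content": "Summarise the tradeoffs of 4-bit quantisation."}
    ],
    "max_tokens": 256,
    "stream": False,
}
body = json.dumps(payload)
print(body)
```

The nice side effect is that any tool that can talk to the OpenAI API can be pointed at the local box just by changing the base URL.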
 
Had a little time to play, and the Corsair box (AI Workstation 300) is slightly faster than the DGX Spark on most everything other than image/video generation, where you definitely want CUDA at this point in time due to software support.

ROCm is catching up quick though.

The only* comparison I found which put them head to head was this:


The only thing I'd add to that video is that if you use one of the hybrid models (on Windows), it uses the NPU for prompt processing, which speeds up that part of his testing to about the same as the DGX Spark. The rest is offloaded to the GPU as before.

Given it's £2.2k for the Corsair box (Strix Halo) versus £3.6k for the DGX Spark (both with 4TB storage), and the Strix Halo is a perfectly usable x86 machine**, I think your workflow would have to be mostly image/video generation to justify the cost differential. Whether £1.4k is worth it for the more mature CUDA software stack depends on individual usage/patience/dislike of giving NVIDIA even more money ;)

*there are a few threads about, like over on the Framework forums, but I think most of those are quite old in terms of builds.
**games like an RTX 4060-4070, and has 128GB/s of memory bandwidth to the 16 Zen 5 CPU cores; the GPU gets 256GB/s and has the MALL cache dedicated to it.
 
Out of interest what limitations are you trying to avoid here?

If it's privacy: API data isn't saved/used for training (by default), while web data is saved by default but can be configured not to be.
 