I'm wondering if anyone can tell me what kind of VRAM bandwidth is suitable for LLM code completion (something in the style of Cursor Tab, but locally hosted, if such a thing exists).
I'm looking at picking up a new GPU primarily for gaming, but would also like to try out LLM code completion features (absolutely not interested in any kind of interactive/chat/agent-mode code generation).
Budget is tight, so I'll likely be looking at a 9060 XT 16GB or a 5060 Ti 16GB.
ROCm/CUDA compatibility issues aside, is the 322 GB/s bandwidth on the 9060 XT suitable compared to the 448 GB/s on the 5060 Ti? Is there some kind of rule of thumb for how much bandwidth is required to run these models quickly?
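My rough understanding (and I may have this wrong, so corrections welcome) is that token generation is memory-bandwidth-bound, so the theoretical ceiling is roughly bandwidth divided by the size of the model weights. A quick sketch of that back-of-envelope math, with the 4 GB model size being a made-up example (roughly a 7B model at 4-bit quantization):

```python
# Rule-of-thumb sketch (assumption, not a measured benchmark):
# for memory-bound token generation, every generated token reads all
# weights once, so max tokens/s ~= VRAM bandwidth / model size in VRAM.
def est_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    return bandwidth_gb_s / model_size_gb

# Hypothetical model size: ~4 GB of weights (e.g. a 7B model at Q4).
for name, bw in [("9060 XT", 322.0), ("5060 Ti", 448.0)]:
    ceiling = est_tokens_per_sec(bw, 4.0)
    print(f"{name}: ~{ceiling:.0f} tok/s theoretical ceiling")
```

Real throughput will obviously come in well under that ceiling, but is that roughly the right way to compare the two cards?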
Thanks!