4 sticks of RAM perform better than 2?

https://youtu.be/vzmF10NZrdw
I noticed this on my new 3800X build when I added my second 2-stick kit and ran a couple benchmarks.

Prior to getting the second kit, I had managed to get my latency down to around 65 ns, but I went back to default settings when I added the second kit (because I didn't expect the 2-stick overclock to work on 4 sticks). My latency went back up to 69-ish ns (3600/C15 stock), but my FPS in 3DMark went *up*. My 3800X picked up 300 points by adding the two extra sticks of RAM... even though AIDA64 clearly showed more latency.

I thought it might be an anomaly unique to 3DMark, so I downloaded Shadow of the Tomb Raider and ran the built-in benchmark. Same thing: 4 sticks got higher frame rates than 2.

The comments section of the video mentions "rank interleaving"?

Is that a thing? I was heavily into this stuff when I built my last rig (3770K) years ago, but never really dug into the RAM side of the performance equation. With this new build, and the 3000-series stuff basically turning CPUs into self-overclocking GPUs, I turned to RAM for a little extra performance, and this 4-stick thing seems to run contrary to conventional wisdom.
 
This is how I understand it (happy to be corrected): rank interleaving, AIUI, is on-module interleaving for two-sided (dual-rank) modules, where each rank is a 64-bit addressable space. One rank (side of the module) refreshes while the other is on its access cycle, giving better total performance. AIUI, as modern CPUs are geared for triple- or quad-channel, rank interleaving helps again.

When you're running quad channel, the memory controller can interleave across ranks on a channel as well as across channels. For best performance, it's advised to add matching memory in sets of 4.
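To make the "interleave across ranks as well as across channels" idea concrete, here is a toy sketch. It assumes a simple round-robin mapping at 64-byte cache-line granularity; real memory controllers use vendor-specific address hashing, so `locate` and `spread` are hypothetical helpers for illustration only, not how any particular CPU does it.

```python
# Toy model only: real controllers use vendor-specific address mapping.
CACHE_LINE_BYTES = 64  # assumed interleave granularity (one cache line)


def locate(address, channels, ranks_per_channel):
    """Map a physical address to a (channel, rank) pair, round-robin."""
    line = address // CACHE_LINE_BYTES
    channel = line % channels                       # spread across channels first
    rank = (line // channels) % ranks_per_channel   # then across ranks on a channel
    return channel, rank


def spread(num_lines, channels, ranks_per_channel):
    """Distinct (channel, rank) targets hit by consecutive cache lines."""
    return sorted({
        locate(i * CACHE_LINE_BYTES, channels, ranks_per_channel)
        for i in range(num_lines)
    })


if __name__ == "__main__":
    # Two channels, one rank each: sequential accesses alternate between 2 targets.
    print(spread(8, channels=2, ranks_per_channel=1))
    # Two channels, two ranks each: the same accesses spread over 4 targets.
    print(spread(8, channels=2, ranks_per_channel=2))
```

With more independent targets, one rank can be busy (or refreshing) while another services the next access, which is the overlap described above.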

https://frankdenneman.nl/2015/03/02/memory-deep-dive-summary/ is a mindbogglingly detailed read on this (it also has pictures). I periodically revisit it, each time convincing myself I understand it better than last time... then I doubt myself and reread it again. (and again...) CPU and memory architecture is complex.

The sections concerning rank and channel interleaving, DIMMs per channel (if you're getting into high capacity systems) and the DDR4 article are very interesting if you're into this kind of stuff.


I noticed quad channel improved FPS for me a bit, and put this down to the benefits of interleaving and access parallelization. But I'm starting to experiment with performance vs. potential max capacity, as I'm topping out the RAM on this machine a lot. As part of some troubleshooting this evening (add more RAM: machine won't POST), I've tried various combinations of Corsair Vengeance 2666 MHz, multiple capacities, using their XMP 2.0 profile.


Tangentially related:
I know it's apples and oranges, but ignoring size, these sticks are notionally the same spec, latency and voltage (with, of course, potential differences in controller, internal latency and size). It might be interesting to see a quick and dirty comparison. Read what you will into these two benchmarks.

4x4 GB, in A1 B1 C1 D1 slots
Z08a3jB.png

2x16 GB, in A1 and C1 positions
AwjbeWr.png
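For a rough sense of the theoretical gap between the two layouts, here is a back-of-envelope sketch. It assumes the 2x16 GB setup in A1 and C1 populates two channels while the 4x4 GB setup populates four, that each DDR4 channel is 64 bits (8 bytes) wide, and that both run at the nominal 2666 MT/s. `peak_bandwidth_gb_s` is a hypothetical helper, and these are peak numbers, not what a benchmark will actually report.

```python
BYTES_PER_TRANSFER = 8  # one 64-bit DDR4 channel moves 8 bytes per transfer


def peak_bandwidth_gb_s(data_rate_mt_s, channels):
    """Theoretical peak bandwidth in GB/s (decimal) for the populated channels."""
    return data_rate_mt_s * BYTES_PER_TRANSFER * channels / 1000


if __name__ == "__main__":
    # Assumption: 4x4 GB in A1 B1 C1 D1 -> four channels populated.
    print(f"4 channels: {peak_bandwidth_gb_s(2666, 4):.1f} GB/s")
    # Assumption: 2x16 GB in A1 and C1 -> two channels populated.
    print(f"2 channels: {peak_bandwidth_gb_s(2666, 2):.1f} GB/s")
```

Measured throughput will land below these figures, and rank count, timings and capacity all muddy a direct comparison.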
 
When you're running quad channel, the memory controller can interleave across ranks on a channel as well as across channels. For best performance, it's advised to add matching memory in sets of 4.
Desktop-platform CPUs have a dual-channel memory controller, so they always work with at most two channels.
But four single-rank DIMMs give two ranks per channel, allowing interleaving within a single channel.
The same interleaving can also be achieved with dual-rank DIMMs (usually the bigger-capacity ones), one per channel.
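The rank counting above can be sketched in a few lines. This assumes DIMMs are spread evenly across the channels and all have the same rank count; `ranks_per_channel` is a hypothetical helper for illustration.

```python
def ranks_per_channel(dimm_count, ranks_per_dimm, channels=2):
    """Ranks visible on each channel, assuming identical DIMMs spread evenly."""
    if dimm_count % channels != 0:
        raise ValueError("DIMMs must divide evenly across channels in this sketch")
    return (dimm_count // channels) * ranks_per_dimm


if __name__ == "__main__":
    # Dual-channel desktop platform (channels=2 by default).
    print(ranks_per_channel(dimm_count=2, ranks_per_dimm=1))  # two single-rank sticks
    print(ranks_per_channel(dimm_count=4, ranks_per_dimm=1))  # four single-rank sticks
    print(ranks_per_channel(dimm_count=2, ranks_per_dimm=2))  # two dual-rank sticks
```

Four single-rank sticks and two dual-rank sticks both come out at two ranks per channel, while two single-rank sticks give only one, so there is nothing to interleave within a channel.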


You don’t say what motherboard you have, but ASRock are running something called T-Topology on their X570 boards, which is optimised for a fully populated motherboard.
DIMM slot topology doesn't affect the logical-level operation of memory at all.
It only affects physical-level connectivity.
T-topology gives better signaling for four DIMMs than daisy chain, allowing a higher maximum frequency at the physical level.
(For two DIMMs, it in turn has a lower physical-level maximum frequency than daisy chain.)

But that doesn't affect the memory controller itself.
Four DIMMs are always a heavier load for the memory controller, limiting the bus speed it can achieve to lower than with two DIMMs.
 
I read a lot about RAM leading up to this build and got the impression that two sticks were better than four. (From an FPS perspective, and as long as you have enough capacity.)

The idea that four sticks could perform better than two, in any circumstance, just wasn't something that was on the radar.

I don't think the benefit of interleaving is covered enough. (Either with two dual-rank sticks or four singles.)
 
nice link
Tom's are one of the few review sites that dedicated several reviews to the single-rank vs dual-rank and 2 vs 4 stick questions

and the benefit from that is sometimes surprisingly large, but very much application/scenario specific
 
You don’t say what motherboard you have, but ASRock are running something called T-Topology on their X570 boards, which is optimised for a fully populated motherboard. It’s mentioned in the memory VRM section of the Gamers Nexus strip-down of the X570 Taichi.

https://youtu.be/OUtvsAmD3Ws

Seems this is actually not the case on the retail boards; that MB was an engineering sample which was built differently:

https://www.youtube.com/channel/UCrwObTfqv8u1KO7Fgk-FXHQ/community
 