384bit vs 512bit memory bus Tomb Raider shootout

Man of Honour
Joined
21 May 2012
Posts
31,941
Location
Dalek flagship
Just for a bit of fun I have done some tests using the Tomb Raider benchmark to compare the memory buses on the Titan (384bit) against the R290X (512bit).

The PCs used were near identical with

Mobo = RIVE
Ram Corsair Dominator 2400mhz 9-11-11-25
CPUs 3970X and 3930k both @4.0ghz

To test what I did was

First set the VRAM on both cards to 1251 (this is the lowest I could set on the Titan).

Next I ran the TR bench @2560 x 1600 (I used this resolution to max the VRAM workload) on both the PCs and overclocked the Titan on the GPU core until both the Titan and 290X were scoring the same (25.4fps).

For the actual tests the only change I made was to increase the VRAM speed on both cards by the same amount, to compare what increase in fps I got.

The reason I did this is that, in theory, the 512bit bus should give more bandwidth, and so more performance, for any given increase in VRAM clock.
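
For anyone who wants to check the theory, here is a minimal Python sketch of peak memory bandwidth at each VRAM clock used in the test. It assumes both cards use GDDR5 moving 4 transfers per memory clock (the clock as shown in the overclocking tools); the function name is made up for illustration, and real-world throughput will be lower than these theoretical peaks.

```python
def gddr5_bandwidth_gbs(mem_clock_mhz: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s (assumes 4 transfers per clock)."""
    transfers_per_second = mem_clock_mhz * 1e6 * 4
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


if __name__ == "__main__":
    # VRAM clocks used in the test, in MHz
    for clock in (1251, 1350, 1450, 1550, 1600):
        titan = gddr5_bandwidth_gbs(clock, 384)
        r290x = gddr5_bandwidth_gbs(clock, 512)
        print(f"{clock} MHz: Titan {titan:.1f} GB/s, 290X {r290x:.1f} GB/s")
```

Each extra MHz adds a third more theoretical bandwidth on the 512bit bus than on the 384bit one (512/384), which is why the wider bus should gain more if bandwidth is the limit.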

R9 290X starting settings

bha4.jpg



Titan starting settings

hzwh.jpg



Results with the VRAM @1251mhz (starting point)

290X
qz7x.jpg


Titan
jr1w.jpg



VRAM now at 1350mhz

290X
k06v.jpg


Titan
tag0.jpg



VRAM now at 1450mhz

290X
ksrf.jpg


Titan
94y0.jpg



VRAM now at 1550mhz

290X
speq.jpg


Titan
um6b.jpg



VRAM now at 1600mhz

290X
spdc.jpg


Titan
3tib.jpg



Please remember this was just one game and one resolution.

Below is a summary of the results in the pics


VRAM @1251mhz 290X 25.4, Titan 25.4

VRAM @1350mhz 290X 26.2, Titan 26.1

VRAM @1450mhz 290X 26.5, Titan 26.8

VRAM @1550mhz 290X 26.8, Titan 27.0

VRAM @1600mhz 290X 26.9, Titan 27.2
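
The summary above can be turned into percentage gains over the 1251mhz starting point with a short Python sketch. The fps figures are copied from the summary; the names `results` and `pct_gain` are just illustrative.

```python
# VRAM clock (MHz) -> (290X fps, Titan fps), copied from the summary above
results = {
    1251: (25.4, 25.4),
    1350: (26.2, 26.1),
    1450: (26.5, 26.8),
    1550: (26.8, 27.0),
    1600: (26.9, 27.2),
}


def pct_gain(base: float, new: float) -> float:
    """Percentage increase from base to new."""
    return (new - base) / base * 100


if __name__ == "__main__":
    base_clock = 1251
    base_290x, base_titan = results[base_clock]
    for clock, (fps_290x, fps_titan) in results.items():
        print(
            f"{clock} MHz (+{pct_gain(base_clock, clock):.1f}% clock): "
            f"290X +{pct_gain(base_290x, fps_290x):.1f}%, "
            f"Titan +{pct_gain(base_titan, fps_titan):.1f}%"
        )
```

With single runs and fps reported to one decimal place, differences of 0.1 to 0.3 fps between the cards may well be within run-to-run variation.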


One possible reason for the Titan pulling ahead is that error correction could have come into play with the 290X.
 
In theory I would say the Titan would gain a bit more in this testing due to it needing more bandwidth at those low clocks. At your starting clocks the Titan is at 240 GB/s, which is lower than its usual 288 GB/s or so, whereas the R9 290X will be less bandwidth limited as it has its usual 320 GB/s. The Titan is also overclocked on the core, so in theory as the memory bandwidth goes up it should benefit even more.
 
Well this game doesn't seem to respond well to memory clock increases. :)

Thanks Kaap, interesting to see.

The increases look small because of the resolution used, but going from 25.4 to 27.2 is nearly a 7.1% increase in fps.

I wanted to use 1600p to give the memory buses some real work to do.
 
Interesting Kaap, thanks for posting.

Pretty close. I wonder if there is a tipping point at which a wider memory bus provides a benefit over a narrower bus with more VRAM, and at what res that might be.
 
In theory I would say the Titan would gain a bit more in this testing due to it needing more bandwidth at those low clocks. At your starting clocks the Titan is at 240 GB/s, which is lower than its usual 288 GB/s or so, whereas the R9 290X will be less bandwidth limited as it has its usual 320 GB/s. The Titan is also overclocked on the core, so in theory as the memory bandwidth goes up it should benefit even more.

Depends whether the bandwidth is saturated though. If it isn't, increasing the clock speed won't increase FPS much more than it does on a card with a narrower bus.
 
Interesting Kaap, thanks for posting.

Pretty close. I wonder if there is a tipping point at which a wider memory bus provides a benefit over a narrower bus with more VRAM, and at what res that might be.

I think you would need to look at 4K and above to go with the slower VRAM, wider bus solution. But that is only my guess; like all things it needs testing.
 
Depends whether the bandwidth is saturated though. If it isn't, increasing the clock speed won't increase FPS much more than it does on a card with a narrower bus.

Yeah, that's true, but there's a good chance, as you are starting from a really low point in bandwidth on the Titan, whereas on the R9 290X you are starting from normal, which is very high in the first place.

Good info though kaap.
 
interesting, nice one kaap.

Now all you need to do is run the whole thing a few more times to see if the results are consistent or within the margin of error. /joking

Trying a different game next might be a good idea though if you really want to push the boat out :D
 
Yeah, that's true, but there's a good chance, as you are starting from a really low point in bandwidth on the Titan, whereas on the R9 290X you are starting from normal, which is very high in the first place.

Good info though kaap.

Not sure that they're being bandwidth limited to be honest (the Titan at stock).

If they were you'd expect them to gain more from overclocking the memory whereas it's fairly constant.

The default bandwidth on the Titans is pretty high still.
 
Not sure that they're being bandwidth limited to be honest (the Titan at stock).

If they were you'd expect them to gain more from overclocking the memory whereas it's fairly constant.

The default bandwidth on the Titans is pretty high still.

He does not start off at default on the Titan though. He starts at 1250, the same as the 290, but at 1250 on the Titan that's only 240 GB/s compared to 320 GB/s on the 290.
 
He does not start off at default on the Titan though. He starts at 1250, the same as the 290, but at 1250 on the Titan that's only 240 GB/s compared to 320 GB/s on the 290.

I just had a quick go on the Heaven 4 bench @1600p as well but this time it was the 290X that had to be overclocked on the core to match the Titan at stock. Again I tested the VRAM @1251mhz and 1600mhz. The performance increase by raising the VRAM speed on both cards was within a single point of each other.

The only thing I can think is the width of the bus is playing no part at this resolution (1600p) on Heaven 4, only raw VRAM mhz.
 
Nice one Kaap! If you have some free time would you be able to do some other benches/games? The 290 is looking really, really good. Just waiting on the aftermarket coolers to come out.
 
There are two architectures being compared here, and the memory chips are going to have different timings. The scaling of the memory overclocks for each card will be shown, but I'm not sure bandwidth limitations will be seen unless you test at a higher resolution than this, or with multi-monitor setups. Certainly an interesting thread; it needs more work but is definitely interesting. Also, are you able to log the GPU statistics from Process Explorer to compare?
 
Nice work Kaap, but it has to be said it's pretty obvious the bus width plays an important part in performance. :)

You only have to look at the GTX 660 Ti vs the GTX 670: both have exactly the same GPU core and yet the GTX 670 is faster clock for clock. The key difference between them is that the 660 Ti has a 192bit bus while the 670 has a 256bit bus.

It's the same on AMD cards. Take the 7870XT (1536 SP, 256bit) vs the 7950 (1792 SP, 384bit): the 7950 beats the 7870XT by 20%, while the 7970 (2048 SP, 384bit) only beats the 7950 by about 5%, despite having a similar number of extra SPs over the 7950 as the 7950 has over the 7870XT. The real difference is again the bus width, and they are all Tahiti GPU cores. :)

It's a combination of both the memory speed and the bus width. What matters most is the bus width: the wider it is, the less chance of the memory speed getting choked, as on the 7870XT and GTX 660 Ti.

A good test is to see how they compare at increasing resolutions.

A 290/X with 1750MHz rated memory ICs would be an absolute monster. At 1750MHz the memory bandwidth would be running at about 450 GB/s; that alone would increase the performance by about 20%.
 
Nice work mate. I honestly don't think the bus width on the 290x is even remotely touching the sides. You need massive VRAM clocks.

I know this is just one test, but it makes sense.
 
Worth bearing in mind that the interface for each memory chip (the black squares on your gcard's PCB are a clump of chips more "properly" referred to as a module) is 64 bits wide. So if you need a contiguous block of data from 1 chip, I believe it may have to be retrieved 64 bits at a time.
 
Kaap can you do the same bench with the two cards again using the same settings and methods but with 0xAA, so that it can be compared against the 4xSSAA results?

The reason for this is to see if higher memory bandwidth helps reduce the percentage frame rate drop when applying higher levels of AA. If the Titan has a noticeably higher frame rate at 0xAA, that would mean the higher memory bandwidth on the 290X does reduce the performance hit.
 
I have the Benchmark app of that ^^^ game

7870XT:

1175 / 1250

Resolution: 1920 x 1080
Texture Quality: 2
Shadow Quality: 3
Anisotropic Filtering: 16
SSAO: ON
Vertical Sync: OFF
DX11 Tessellation: ON
DX11 Advanced Shadows: ON
DX11 MSAA Samples: 1


Benchmark Summary:

Number of frames: 8687
Average Frame Time: 12.1ms
Average FPS: 82.9

1175 / 1563

Resolution: 1920 x 1080
Texture Quality: 2
Shadow Quality: 3
Anisotropic Filtering: 16
SSAO: ON
Vertical Sync: OFF
DX11 Tessellation: ON
DX11 Advanced Shadows: ON
DX11 MSAA Samples: 1


Benchmark Summary:

Number of frames: 9467
Average Frame Time: 11.1ms
Average FPS: 90.3
Memory clock: +25%
Performance difference: about +9%
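
As a rough check on how well the 7870XT scales, here is a small Python sketch that works out the percentages from the two runs above and the fraction of the memory overclock that shows up as extra fps. The function names are made up for illustration, and this is a single pair of runs, so treat the result as indicative only.

```python
def pct_gain(base: float, new: float) -> float:
    """Percentage increase from base to new."""
    return (new - base) / base * 100


def scaling_efficiency(clock_base: float, clock_new: float,
                       fps_base: float, fps_new: float) -> float:
    """Fraction of the clock increase that turns into an fps increase."""
    return pct_gain(fps_base, fps_new) / pct_gain(clock_base, clock_new)


if __name__ == "__main__":
    # Figures from the two 7870XT runs above (memory 1250 -> 1563, core fixed at 1175)
    clock_gain = pct_gain(1250, 1563)
    fps_gain = pct_gain(82.9, 90.3)
    eff = scaling_efficiency(1250, 1563, 82.9, 90.3)
    print(f"Memory clock: +{clock_gain:.1f}%")
    print(f"Average FPS:  +{fps_gain:.1f}%")
    print(f"Scaling efficiency: {eff:.2f}")
```

An efficiency of 1.0 would mean the card is entirely memory bandwidth bound; a value near 0 would mean the memory clock hardly matters.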
 