
PowerColor 6970 pictured and benched

I don't understand how he manages to make every response a novel; no matter how small the point, he always goes overboard =/
 
go on, go on...

AMD must have made some phone calls right? everyone's starting to spill the beans...

TBH in my view they needed to, as it was getting slated badly for the last 3-4 days.

Also, I second the comment about benchmarking programs: give us game results, as you can't play benchmarking programs. They are just for epeen spectators :p
 
***laughs***

Duffman - can you do us a summary?

Summary
He's saying the exact same thing as before... That (somehow) the removal of the special function unit (i.e. the change to a "4D" shader grouping rather than "5D") is more of a dramatic change than the reworking of the entire pipeline process in Fermi. This is most certainly not a viewpoint shared by anyone else in the hardware community, and when you look at the list of changes made to the architecture in Fermi (that I describe here) then you can see why.

His reason for dismissing the massive changes in Fermi is that "they do the same job even though they have been broken down and moved around".



------- response to "points" ---------

Well anyway, to counter these points briefly (I have no intention of getting into an essay-writing contest on a point that nobody else but him has any doubt over):

1. There are only a few basic operations that a modern GPU needs to perform. So from that point of view ALL GPUs perform the same basic operations.

2. He asks: "do you think the core logic will send instructions in the same manner to two separate types of shaders, in a 5 way cluster, as it will a 4 way cluster with 4 identical shaders".

The answer is "for the most part, yes". There are differences to be sure, but from the point of view of data structure they are not as dramatic as you might think. For example, when a transcendental operation is encountered, instead of sending an instruction to the special function unit, three of the "generalised" shaders are put to work on the transcendental. The bigger change takes place within the shader itself. For those of you with programming experience, think of it as swapping out one function for another. The code inside the function changes (cf. the operation of the shaders), but the code that calls it changes very little (cf. the core logic).
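The function-swap analogy above can be sketched in a few lines of (purely illustrative) Python. Everything here is hypothetical pseudo-hardware, not real GPU code: the point is only that the "call site" (the core logic) is identical for both designs, while the internals of the "shader" differ.

```python
import math

# Hypothetical sketch: the core logic always calls evaluate(op, x) the same
# way; only what happens inside the "shader" changes between the designs.

def special_function_unit_sin(x):
    # "5D" design: a dedicated special function unit handles transcendentals.
    return math.sin(x)

def general_shaders_sin(x):
    # "4D" design: general shaders cooperate on the transcendental, e.g. via
    # a short polynomial approximation (illustrative only).
    return x - x**3 / 6 + x**5 / 120

def evaluate_5d(op, x):
    if op == "sin":
        return special_function_unit_sin(x)
    return x * x  # ordinary ALU work, e.g. a multiply

def evaluate_4d(op, x):
    if op == "sin":
        return general_shaders_sin(x)
    return x * x

# The caller ("core logic") does not change between the two designs:
for evaluate in (evaluate_5d, evaluate_4d):
    result = evaluate("sin", 0.5)
```

Both versions are interchangeable from the caller's perspective, which is the point being made: a change of this kind is contained within the unit, not spread across the whole chip.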

The really dramatic changes come into play when you change the flow of data (i.e. alter the pipeline). Here you change the nature of the data you're passing around (like rewriting the "main" file in the programming analogy), and so everything that interfaces with it must also be altered. In terms of changes to the shader pipes, changing to a "4D" architecture is not too different to when Nvidia removed the "dangling float" from their GT200 architecture (in Fermi). The core is no longer able to process the "useless" third float within each clock cycle, but the mode of calling does not change dramatically because of this.
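Continuing the same (hypothetical) sketch, here is why a pipeline change is the dramatic one. In this toy pipeline each stage consumes the record produced by the one before it, so changing the shape of that record (the data flow) forces every downstream stage to be revisited, whereas swapping one stage's internals touches nothing else. Stage names and fields are invented for illustration.

```python
# Toy pipeline: each stage depends on the exact record shape the previous
# stage emits. Change that shape (e.g. add a "normal" field for lighting)
# and every stage that touches the record must change too.

def vertex_stage(v):
    # Emits the record that everything downstream depends on.
    return {"pos": v, "color": (1.0, 1.0, 1.0)}

def raster_stage(rec):
    # Reads the fields vertex_stage produced; breaks if they change.
    return {"pixel": rec["pos"], "color": rec["color"]}

def shade_stage(rec):
    return rec["color"]

def pipeline(v):
    return shade_stage(raster_stage(vertex_stage(v)))
```

Rewriting `vertex_stage` internally (the function-swap case) leaves `raster_stage` and `shade_stage` untouched; adding a new field that the later stages must consume is the "rewrite main" case, where every interface downstream has to be altered.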

3. "it really does not matter in the slightest WHERE the TMU is if the TMU does exactly the same ruddy thing, it just doesn't."
That's just silly. So, if a particular modular unit (say the rasterizer, or a texture unit, or a geometry processor) is tied to each SM, meaning that it processes data from at most 32 processing cores, you think it works in exactly the same way as if it's placed at the global level? If nothing else, the way it connects to the other components will have changed (different bandwidth requirements, potentially with different data types / vector sizes transferred).

4. He conveniently does not address the global L2 cache, the parallel thread-kernel execution capabilities, or the formation of a new data hierarchy (the division of the GPU into four "GPCs", linked only via the L2 cache). These are the most obvious 'high-level' changes.

... he is correct about one thing though, the division of the thread dispatcher occurred in Barts, not in Cayman. But this does not change any of the points I have made.

Finally, I point out once again that you need only read an analysis piece by a decent technical site (like AnandTech or similar) to see how they view the changes that took place from GT200 to GF100 in comparison to those between generations of AMD products. I know that you think you know better than all these people. But really, you don't. I also know you would like to believe that no "real development" went into GF100, since that would paint nvidia in an even worse light. But that is not the case.





------ Okay, here is GPU design 101 written in simple terms. Might be worth a read guys, even if you ignore the above -----

When you redesign a GPU, there are any number of adjustments you can make. In slightly over-simplified terms, you could group them into three categories:

a) "Here and now" performance improvements. If your architecture still has "room to breathe" then you can simply add more processing / texturing power, and tweak the various sub-components.

b) To add new features. This will usually involve rearranging and expanding core-logic, and/or inserting new processing blocks. We saw this recently with tessellation: AMD added a single global tessellator that performed its own independent computations, whereas nvidia added a geometry setup unit to each SM (16 in total for Fermi), which uses the shader cores to perform tessellation.

c) To improve scalability: i.e. to allow you to perform (a) in the future without getting diminishing returns. This generally involves breaking down the existing sub-units into smaller, more modular versions, and moving them closer to the core.

- Operation (a) is the most preferable. You gain performance without dramatic changes to your architecture. That AMD have been able to perform largely these types of operation since r600 is a testament to the quality of that architecture and pipeline design.
- Operation (b) is, obviously, performed to include new features as and when required (for DX updates or for GPGPU functionality etc).
- Operation (c) is generally the most complex, as it requires a re-working of the entire GPU (from top-to-bottom of the pipeline), which requires changes in how each individual component works. This is not something you can afford to do every generation, and so it is always done with "one eye on the next few generations", i.e. with the idea to create something where only operation (a) is required for the next couple of generations.

Fermi was a type (c) update. As was G80 (the 8800GTX), and r600 (2900xt). All other updates since then have been largely of type (a) or (b). The switch to a 4D shader is a change to the base architecture, but as it does not affect the flow of the pipeline it does not require "a complete rework of the GPU". I think we will see further "type c" changes with the next AMD generation, though I suspect they will make these adjustments more gradually, having learned from nvidia (with Fermi) and themselves (with r600) that these massive rewrites don't always go as planned.
 
DM's wall is gonna be fkn hooge after that "summary" :p

Yeah... Well, I've made all the points that need making so he can write all he wants. I don't have infinite time to go over the same points.

Anyway, only the first few lines were a summary. The next 'block' is a counter to some of the "points" made. The final section is a more general description of the types of development that will be made when revising a GPU.

That last bit might be well worth a read - you might learn something :p The rest can be summarised as "DM thinks that Fermi was a minor rewrite, everyone else in the world thinks otherwise"...
 