Conroe in July

killer_uk said:
What is a 'Merom'? :confused:

Mobile Conroe chip

"Mobile Platform Laptop users can take advantage of the increased multi-core compute capability within the mobile form factors.

Intel is developing a mobility-optimized, dual-core processor based on the new, state of the art, Intel Core microarchitecture, codenamed Merom. The Merom processor will work within the Intel® Centrino Duo® mobile technology-based platform and Merom is targeted for introduction to align with the 2006 holiday buying season."

See here for more details
 
I know what you mean, Merom sounds just like Yonah (dual-core Dothan, I think, on the smaller 65nm process). People are raving about Conroe, but it sounds like a bit of an over-glorified Pentium M to me, since Pentium Ms (Dothans) were capable of beating Athlon 64s at the same or slightly slower clock speeds. What's Conroe bringing to the table that doesn't already exist, and what are the performance advantages over Dothan and Yonah? I seriously doubt it will be much of a performance boost. Dothan was the best core Intel have come up with in a long time. Also, before I finish, isn't Conroe based on Dothan's architectural design (not copied, but based on): moderate clock speeds, high IPC?
 
Gashman said:
I know what you mean, Merom sounds just like Yonah (dual-core Dothan, I think, on the smaller 65nm process). People are raving about Conroe, but it sounds like a bit of an over-glorified Pentium M to me, since Pentium Ms (Dothans) were capable of beating Athlon 64s at the same or slightly slower clock speeds. What's Conroe bringing to the table that doesn't already exist, and what are the performance advantages over Dothan and Yonah? I seriously doubt it will be much of a performance boost. Dothan was the best core Intel have come up with in a long time. Also, before I finish, isn't Conroe based on Dothan's architectural design (not copied, but based on): moderate clock speeds, high IPC?

To be honest, that shows how little you understand CPU architecture....

The Dothan core was actually little more than a P6 core (the Pentium Pro/PII/PIII) with a few changes in architecture (seriously improved branch prediction, micro-op fusion and a couple of other tweaks). Those were pretty much 'front end' tweaks: the actual execution unit architecture was pretty much unchanged from the PIII, and only the way the instructions were fed in changed.

Link to a good architectural overview of the Dothan/Pentium-M/Banias core

By contrast, the Core architecture that powers the Conroe, Merom and Woodcrest CPUs is new from the ground up (unlike Dothan), building on a combination of what has been learnt from Netburst (the P4 architecture), the old P6 architecture, lessons learnt from Dothan and generally a determination to build a wide, high IPC processor.

Article discussing the new 'Core' architecture that is behind the three new chip categories

For those who haven't got time to read the articles, or perhaps find them a bit technical (which they can be), I'll post two diagrams to show the differences.

P6 Architecture (Pentium Pro/PII/PIII/Pentium-M)

p6-core.png


Core Architecture

core.gif


(Credit to Arstechnica for the diagrams)

Comparing the two, the Core architecture is seriously beefed up in comparison to the P6.

For completeness, I can throw in a diagram from the A64 (Full article on the A64 here) to see where it differs from both.

opteron.png


That shows (along with various other changes) how the cores evolve, and why one is likely to be faster than another.

(For information about Netburst, have a read of these two articles here and here)

There are, obviously, front end changes as well (pipelines, scheduling, ordering, cache sizes, access methods and so on), which are the mainstay of 'evolution' in CPUs rather than revolution, but fundamentally you can only push an architecture so far with front end tweaks before it falls down.

Conroe and, by extension, Merom are both likely to be significantly quicker than Dothan, with the possible exception of the very early chips for the new architecture (much like the Pentium Pro was slower than the Pentium in some circumstances, and the same with the P4 vs the P3 when it launched).
 
Why does the K8 architecture there look so much more efficient than the other two? I mean look at it, it's much less hassle, and accomplishes the same results with a much less complicated design. (And please, no crap about 'not knowing about processor architecture', it's just a bloody question, nothing more, nothing less.)
 
Gashman said:
Why does the K8 architecture there look so much more efficient than the other two? I mean look at it, it's much less hassle, and accomplishes the same results with a much less complicated design. (And please, no crap about 'not knowing about processor architecture', it's just a bloody question, nothing more, nothing less.)

Efficiency isn't measured by putting in fewer features, but by how well those features can be utilised (one of the problems with Netburst, and the reason hyperthreading works so well on the P4, is that lots of the execution units sit idle a lot of the time). The K7/K8 has more execution units than the P6, incidentally; it just has them less spread out into specific functions.

The architecture also looks a bit simpler because the load/store unit (shown in yellow on the Intel diagrams) is not shown on the diagram for the K8 (it's often omitted for clarity in such diagrams). Unfortunately I can't find one that has it in for the K8 (or the K7, which execution-wise is the same). (Should have mentioned this in my previous post actually, but I forgot :rolleyes: )

Going back to my first point, the real issue is making sure the execution units are processing code. This is where the front end optimisation comes in (scheduling, branch prediction, OOE (out of order execution) units and so on).

If you have a 'narrow' processor (i.e. one with a few EUs that can only do a couple of instructions at once), scheduling is fairly easy: you execute instructions in order, as they come. The problem with this approach is that the only way to make it quicker is to increase the clockspeed.

pentium.png
(credit arstechnica again)

This is the original Pentium architecture, and it's incredibly simple really: one single floating point EU with a single pipeline, and two integer EUs each with their own pipeline (which weren't actually identical, strangely; one had extra hardware and was much more capable than the other).

The Pentium didn't execute instructions out of order (apart from in a few very limited circumstances). It actually had a lot of hardware devoted to making x86 instructions work on the architecture (a workaround that's still done today, only much, much more efficiently).

The problem with increasing clockspeed is that you have to increase the bus speed, the multiplier, or both. It also generally requires more voltage and produces more heat.
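To put rough numbers on that, here is a minimal Python sketch. The clock is simply bus speed times multiplier; the power function uses the usual CMOS rule of thumb that dynamic power scales with voltage squared times frequency, which is my addition rather than something from the articles. The function names and the example figures are made up for illustration.

```python
def core_clock_mhz(bus_mhz: float, multiplier: float) -> float:
    """Core clock is the bus (base) clock times the CPU multiplier."""
    return bus_mhz * multiplier


def relative_dynamic_power(volt_ratio: float, freq_ratio: float) -> float:
    """Rule-of-thumb scaling: dynamic power ~ V^2 * f (ignores leakage)."""
    return volt_ratio ** 2 * freq_ratio


# Illustrative figures only: a 200 MHz bus with a 12x multiplier.
stock = core_clock_mhz(200, 12)
# Same multiplier, bus pushed to 250 MHz.
overclocked = core_clock_mhz(250, 12)

# Assume (hypothetically) the overclock needs 10% more voltage.
power_ratio = relative_dynamic_power(1.10, overclocked / stock)

print(f"stock: {stock:.0f} MHz, overclocked: {overclocked:.0f} MHz")
print(f"dynamic power grows by roughly {power_ratio:.2f}x")
```

In this made-up case a 25% clock increase costs about 50% more dynamic power, which is the sort of trade-off that makes chasing clockspeed alone unattractive.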

The in-order approach has long been discarded (starting with the P6 core, and most things that followed on from it, including the K7 and K8 cores from AMD) in favour of being able to analyse instructions, split them all up and run them in the order that allows for the most efficient use of the hardware available, then put the results back into program order afterwards. This is out of order execution, and it was the key to the success of the P6 core and everything that came after it.
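A toy simulation makes the difference easier to see. This is only a sketch under loud assumptions: a hypothetical machine that can issue two instructions per cycle, made-up latencies, and every instruction writing a unique register (so there is no renaming to worry about). It is not how any real scheduler is implemented; `Instr`, `simulate` and the little program are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    dest: str            # register written (assumed unique per instruction)
    srcs: tuple          # registers read
    latency: int = 1     # cycles until the result is available


def simulate(program, width, out_of_order):
    """Return the cycle at which the last result is ready on a toy machine."""
    # Results of not-yet-issued instructions are unavailable (infinity);
    # registers nobody writes are inputs and are ready from cycle 0.
    ready_at = {ins.dest: float("inf") for ins in program}
    pending = list(program)
    cycle = 0
    finish = 0
    while pending:
        issued = 0
        i = 0
        while i < len(pending) and issued < width:
            ins = pending[i]
            if all(ready_at.get(s, 0) <= cycle for s in ins.srcs):
                ready_at[ins.dest] = cycle + ins.latency
                finish = max(finish, cycle + ins.latency)
                pending.pop(i)
                issued += 1
            elif out_of_order:
                i += 1       # skip the stalled instruction, look further ahead
            else:
                break        # in-order: everything waits behind the stall
        cycle += 1
    return finish


# Two independent load+add chains feeding a final add (latencies are made up).
program = [
    Instr("load a", "a", ("x",), latency=3),
    Instr("b = a + 1", "b", ("a",)),
    Instr("load c", "c", ("y",), latency=3),
    Instr("d = c + 1", "d", ("c",)),
    Instr("e = b + d", "e", ("b", "d")),
]

print("in-order cycles:    ", simulate(program, width=2, out_of_order=False))
print("out-of-order cycles:", simulate(program, width=2, out_of_order=True))
```

With these made-up latencies the in-order run takes 8 cycles and the out-of-order run takes 5, because the second load can start while the first add is still waiting on its data.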

The secret to improving processors these days (as Intel discovered with Netburst) is not going to be getting sillier and sillier clockspeeds, but getting more instructions processed on each clock cycle.

This solution isn't without problems of its own, however. It requires a lot of work in scheduling, prediction, dependencies and so on, and can backfire in a major way if you get this wrong. A large, wide processor will only be as efficient as its front end, because if those EUs sit idle, then the processor will be slow.
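As a back-of-the-envelope illustration of why width only helps if the units are kept busy, here is a small Python sketch. The model (clock times number of units times the fraction of the time they are busy) is a deliberate simplification, and every figure in it is hypothetical rather than a measurement of any real chip.

```python
def instructions_per_second(clock_hz: float, units: int, utilisation: float) -> float:
    """Very rough throughput: each busy unit retires one instruction per cycle."""
    if not 0.0 <= utilisation <= 1.0:
        raise ValueError("utilisation must be between 0 and 1")
    return clock_hz * units * utilisation


# Hypothetical chips, numbers picked purely for illustration.
fast_but_idle = instructions_per_second(3.8e9, units=3, utilisation=0.30)
slower_but_busy = instructions_per_second(2.4e9, units=4, utilisation=0.60)
wide_but_starved = instructions_per_second(2.4e9, units=4, utilisation=0.20)

print(f"fast clock, idle units:  {fast_but_idle / 1e9:.2f} G instr/s")
print(f"lower clock, busy units: {slower_but_busy / 1e9:.2f} G instr/s")
print(f"wide, bad front end:     {wide_but_starved / 1e9:.2f} G instr/s")
```

The lower-clocked chip wins comfortably when its units are kept fed, and loses when the front end can't keep them busy, which is the whole point about scheduling and prediction.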

The big difference between the K8 approach and the Core approach is that AMD have simply gone with existing all-purpose units, while Intel have beefed up the all-purpose units and added some more specific units (the vector ones) in addition to those. In theory this means that the Core CPU can process more instructions at once, because the K8 does vector calculations in its standard units, so it can't be using those units for something else at the same time.
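The shared-versus-dedicated trade-off can be sketched with a bit of arithmetic in Python. This assumes a perfectly schedulable mix of independent operations, one operation per unit per cycle, and unit counts chosen only for illustration; they are not the real K8 or Core unit counts.

```python
from math import ceil


def cycles_shared(scalar_ops: int, vector_ops: int, general_units: int) -> int:
    """All work competes for the same general-purpose units."""
    return ceil((scalar_ops + vector_ops) / general_units)


def cycles_dedicated(scalar_ops: int, vector_ops: int,
                     general_units: int, vector_units: int) -> int:
    """Scalar and vector work run side by side on separate units."""
    return max(ceil(scalar_ops / general_units),
               ceil(vector_ops / vector_units))


# Hypothetical workload: 12 scalar ops and 6 vector ops, no dependencies.
shared = cycles_shared(12, 6, general_units=3)
dedicated = cycles_dedicated(12, 6, general_units=3, vector_units=2)

print("shared units:   ", shared, "cycles")
print("dedicated units:", dedicated, "cycles")
```

In this toy case the shared design needs 6 cycles and the design with separate vector units needs 4, because the vector work no longer steals slots from the scalar work.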

The maximum number of operations that can be completed in one clock cycle is determined by the number of EUs available and by how many instructions can be passed on to them per cycle (some EUs share access, so they can't all be fed at once).
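A quick sketch of that limit in Python, assuming a simplified model where EUs sit behind issue ports and each port can pass on one operation per cycle. The port layout below is invented for the example and does not describe any particular chip.

```python
def peak_ops_per_cycle(ports: dict) -> int:
    """Each port with at least one EU behind it can start one op per cycle."""
    return sum(1 for units in ports.values() if units)


def total_units(ports: dict) -> int:
    """Count every EU, regardless of which port it shares."""
    return sum(len(units) for units in ports.values())


# Hypothetical layout: five EUs, but two pairs share a port.
ports = {
    "port0": ["int ALU", "FP add"],
    "port1": ["int ALU", "FP mul"],
    "port2": ["load/store"],
}

print("execution units:  ", total_units(ports))
print("peak ops per cycle:", peak_ops_per_cycle(ports))
```

So in this made-up layout there are five EUs, but the chip can never start more than three operations in a single cycle, because units that share a port have to take turns.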

Hope that explains it a bit...

More in-depth details on what I'm talking about can be found here. It focuses on the evolution of the Intel Pentium line, but it covers everything and explains it well, and generally AMD have followed, rather than led, when it comes to innovation like this.
 
Thanks guys :)

Gashman, I hope you didn't take my comment at the start of my first post to be offensive, it wasn't meant to be :)

I must say that I've not paid much attention to most of this for the last couple of years, Conroe is the first thing that's really sparked my interest in a while (I'm actually still currently running a 2.4c northwood as my main rig), and that's a good thing in my mind.

I think I must be pretty sad at points, I read this sort of thing for fun :)
 