AMD to Start Production of Desktop "Bulldozer" Microprocessors in April.

Looks like it's going to come too late for me, so I'll be jumping back in with the dark side, Sandy Bridge ;) All being well, they won't be too stupid on price with Bulldozer's impending arrival...



Edit: typo!
 
AVX will not be that good on Bulldozer. It was tacked on, really. AMD wanted to implement SSE5 as far as I know, but that did not fly.

Sandy Bridge will be better at AVX. Anyway, I hope that they are going to be competitive, I really do. I would love my next rig to be an AMD one.
 
Yup, can wait no longer, sadly. I just want the Core i7-2 to launch and I'm done... Maybe by 2012 AMD/Intel 8 cores will be the norm and very powerful :)
 
those cores will be packed into four modules. Every module which will have two independent integer cores (that will share fetch, decode and L2 functionality) with dedicated schedulers, one "Flex FP" floating point unit with two 128-bit FMAC pipes with one FP scheduler.

That bit above sounds like MIMD over multiple cores, in this case 2 cores. I think this because of the single fetch/decode, as this is where the scheduler will be working.

If that's the case, one OS/application thread will run over 2 cores, so these chips will speed up all software, including software written only for a single core.
 
AVX will not be that good on Bulldozer. It was tacked on, really. AMD wanted to implement SSE5 as far as I know, but that did not fly.

Sandy Bridge will be better at AVX. Anyway, I hope that they are going to be competitive, I really do. I would love my next rig to be an AMD one.

AVX isn't tacked on; you can't tack on something like that, and a 256-bit FPU is a 256-bit FPU. Intel's FPU improvements have several limitations as well. AVX will depend largely on people programming to wrap up their FPU code together, which I can't really see a reason for them not to do.

those cores will be packed into four modules. Every module which will have two independent integer cores (that will share fetch, decode and L2 functionality) with dedicated schedulers, one "Flex FP" floating point unit with two 128-bit FMAC pipes with one FP scheduler.

That bit above sounds like MIMD over multiple cores, in this case 2 cores. I think this because of the single fetch/decode, as this is where the scheduler will be working.

If that's the case, one OS/application thread will run over 2 cores, so these chips will speed up all software, including software written only for a single core.

That's simply not how it is; it will not split a workload over two threads. It's only a module because the FPU is a huge two-thread 256-bit FPU that can also be used as 2 separate 128-bit FPUs.

You've got this big FPU in the middle and an integer unit on either side. One integer unit can use the whole 256-bit FPU on its own for one clock, or the other integer unit can use it, or they can each share half of the FPU (because the majority of the workload will be 32/64/128-bit FPU code). Since a piece of AVX code that uses the whole 256-bit unit can push through work that could take up to 8 clocks on a single 128-bit unit, it's still MUCH MUCH faster to share it for one clock than to wait one clock before using it.
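
To make the sharing described above a bit more concrete, here's a rough toy model in Python. It is only a sketch under my own assumptions, not how the real FP scheduler works: the function name `cycles_to_drain`, the round-robin priority and the one-clock-per-op cost are all made up for illustration. The only thing taken from the description is that a module has two 128-bit pipes, so a 256-bit op needs both in the same clock and a 128-bit op needs one.

```python
from collections import deque

PIPES = 2  # two 128-bit FMAC pipes per module, per the quoted description


def cycles_to_drain(core0_ops, core1_ops):
    """Count clocks needed to issue every FP op from both integer cores.

    Each op is given as its width in bits (128 or 256). Assumptions of this
    toy model: every op occupies its pipe(s) for exactly one clock, and the
    core that gets first pick alternates every clock.
    """
    queues = [deque(core0_ops), deque(core1_ops)]
    cycles = 0
    first = 0  # hypothetical round-robin priority between the two cores
    while queues[0] or queues[1]:
        free = PIPES
        for core in (first, 1 - first):
            q = queues[core]
            # Issue ops in order while enough pipes are free this clock.
            while q and free > 0:
                need = q[0] // 128  # 1 pipe for 128-bit, 2 for 256-bit
                if need > free:
                    break
                q.popleft()
                free -= need
        cycles += 1
        first = 1 - first
    return cycles


if __name__ == "__main__":
    print(cycles_to_drain([128], [128]))       # both cores share the FPU
    print(cycles_to_drain([256, 256], []))     # one core uses the whole FPU
    print(cycles_to_drain([256, 256], [256]))  # cores take turns
```

In this model two cores issuing 128-bit ops never wait on each other, while 256-bit ops from both cores have to take turns, which is the trade-off being argued about.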

Both integer cores are separate. The reason it's a module is that you can save a LOT of core logic by adding a second, very small integer unit into an existing module with minor modifications, rather than having a separate integer unit, which would also require its own FPU, its own everything.

The die size cost of adding a second integer unit, and adding 99% integer performance, is 5%.
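
As a back-of-the-envelope check on what those two figures would mean, here's a tiny Python sketch. The 5% and 99% are just the numbers claimed above, not measurements, and I'm assuming the 5% is relative to the area of a single-core design; the helper name `throughput_per_area` is made up for this example.

```python
def throughput_per_area(extra_area, extra_throughput):
    """Integer throughput per unit of die area, relative to one
    stand-alone core (defined here as area 1.0, throughput 1.0)."""
    return (1.0 + extra_throughput) / (1.0 + extra_area)


# Module: second integer unit adds 5% area and 99% throughput (claimed figures).
module = throughput_per_area(0.05, 0.99)

# Baseline: a second complete core doubles both area and throughput.
two_full_cores = throughput_per_area(1.0, 1.0)

if __name__ == "__main__":
    print(f"module:         {module:.2f}x throughput per area")
    print(f"two full cores: {two_full_cores:.2f}x throughput per area")
```

If the claimed figures held, the module would deliver close to twice the integer throughput per unit of area of two complete cores, which is the whole argument for the design.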

Remember, the future for both AMD and Intel is fusion. The 256-bit FPU is a HUGE jump up on current FPU power, but further in the future the requirement for balanced FPU/integer isn't there: as GPUs get on die, it will almost always be quicker to run FPU stuff on the on-die GPU.

At which point you'll be wanting to pair 2, or 4, or 8 integer cores per FPU core.

I've said before, Bulldozer is going to be pretty insanely good. It will spank the crap out of a Phenom and probably be within 5% IPC of Sandy Bridge, which is just a massive gain. However, there's also a very, very high potential, due to the die size saving of the second core in a module and the pipeline design, for higher-speed parts than Intel can do, though in reality it will probably be on par on clock speed, because they'll be behind on process for 18-24 months.

The time when Bulldozer will become a truly amazing architecture will be with the on-die GPU added; my guess would be on the 22nm shrink.
 
From what little I've read, the module architecture is to reduce die size and power requirements; this is what you're saying above.

However, those 2 cores being served by one fetch/decode and scheduler looks to me like running single-thread instructions over 2 cores. In my previous post I never said workload over two threads, as that's the reverse of what I'm saying. It would be done using some out-of-order logic to look down the single fetch, with the scheduler looking at what can be done in advance for the second CPU. MIMD over multiple cores, or reverse hyperthreading as others call it.

But as said I've not really looked into the design, and I could be totally wrong. I was just making a general observation.
 