AMD to Start Production of Desktop "Bulldozer" Microprocessors in April.

Troezar · 7 Nov 2010 at 15:41

Looks like it's going to come too late for me, so I'll be jumping back in with the dark side, Sandy Bridge.

All being well they won't be too stupid on price with Bulldozer's impending arrival...

Edit: typo!

Trunks9486 · 7 Nov 2010 at 17:38

AVX will not be that good on Bulldozer. It was tacked on really. AMD wanted to implement SSE5 as far as I know, but that did not fly.

Sandy Bridge will be better at AVX. Anyway, I hope that they are going to be competitive, I really do. I would love my next rig to be an AMD one.

HighlandeR · 8 Nov 2010 at 00:08

Yup, can wait no longer sadly, just want the Core i7-2 to launch and I'm done.... maybe by 2012 AMD/Intel 8 cores will be the norm and very powerful.

Admetos · 8 Nov 2010 at 01:32

those cores will be packed into four modules. Every module will have two independent integer cores (that will share fetch, decode and L2 functionality) with dedicated schedulers, and one "Flex FP" floating point unit with two 128-bit FMAC pipes and one FP scheduler.

That bit above sounds like MIMD over multiple cores - in this case 2 cores. The reason I think this is because of the single fetch / decode as this is where the scheduler will be working.

If that's the case, one OS/application thread will run over 2 cores, so these chips will speed up all software, including those only written for single core.

drunkenmaster · 8 Nov 2010 at 12:42

Trunks9486 said:
AVX will not be that good on Bulldozer. It was tacked on really. AMD wanted to implement SSE5 as far as I know, but that did not fly.

Sandy Bridge will be better at AVX. Anyway, I hope that they are going to be competitive, I really do. I would love my next rig to be an AMD one.

AVX isn't tacked on. You can't tack on something like that, and a 256bit FPU is a 256bit FPU. Intel's FPU improvements have several limitations as well. AVX will depend largely on people programming to wrap up their FPU code together, which I can't really see a reason for them not to do.

JasonM said:
those cores will be packed into four modules. Every module will have two independent integer cores (that will share fetch, decode and L2 functionality) with dedicated schedulers, and one "Flex FP" floating point unit with two 128-bit FMAC pipes and one FP scheduler.

That bit above sounds like MIMD over multiple cores - in this case 2 cores. The reason I think this is because of the single fetch / decode as this is where the scheduler will be working.

If that's the case, one OS/application thread will run over 2 cores, so these chips will speed up all software, including those only written for single core.

That's simply not how it is. It will not split a workload over two threads; it's only a module because the FPU is a huge two-thread 256bit FPU that can also be used as 2 separate 128bit FPUs.

You've got this big FPU in the middle and an integer unit on either side. One integer unit can use the whole 256bit FPU on its own for one clock, or the other integer unit can use it, or they can both share half of the FPU each (because the majority of the workload will be 32/64/128bit FPU code). When it can push through an AVX piece of code and use the whole 256bit unit, it can push through code that could take up to 8 clocks on a single 128bit unit, so it's still MUCH MUCH faster to share it for one clock than to wait one clock before using it.
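As a rough illustration of the sharing described above, here is a toy Python model of one clock of arbitration between the two integer cores and the two 128-bit FMAC pipes. The function name `grant`, the `priority` tie-break and the "loser waits" rule are assumptions made for the sketch; nothing here is taken from AMD's actual scheduler design.

```python
PIPES = 2  # two 128-bit FMAC pipes per module, as described in the thread


def grant(req_a, req_b, priority="a"):
    """Return (pipes_a, pipes_b) granted to the two integer cores this clock.

    req_a and req_b are the requested FP widths in bits: 0, 128 or 256.
    The tie-break (priority core is served first, the other waits if its
    full request no longer fits) is a hypothetical rule for illustration.
    """
    for req in (req_a, req_b):
        if req not in (0, 128, 256):
            raise ValueError("width must be 0, 128 or 256 bits")

    need_a, need_b = req_a // 128, req_b // 128

    # Both requests fit at once: e.g. one 256-bit op alone, or two 128-bit ops.
    if need_a + need_b <= PIPES:
        return need_a, need_b

    # Oversubscribed: serve the priority core in full, the other only if it fits.
    if priority == "a":
        got_b = need_b if need_b <= PIPES - need_a else 0
        return need_a, got_b
    got_a = need_a if need_a <= PIPES - need_b else 0
    return got_a, need_b


print(grant(256, 0))    # one core takes the whole 256-bit unit
print(grant(128, 128))  # both cores take one 128-bit pipe each
print(grant(256, 256))  # conflict: the non-priority core waits a clock
```

In this model the only time a core stalls is when the combined request is wider than 256 bits, which matches the claim that most 32/64/128bit FP code from both cores can run side by side.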

Both integer cores are separate. The reason it's a module is that you can save a LOT of core logic by adding a second, very small integer unit into an existing module with minor modifications, rather than having a separate integer unit, which would also require its own FPU, its own everything.

The die size cost of adding a second integer unit, and adding 99% integer performance, is 5%.
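To put the figures quoted above into perspective, here is a quick back-of-the-envelope calculation in Python. It takes the 99% and 5% numbers from the post at face value (they are the poster's figures, not verified ones), and the helper name `perf_per_area` is made up for the example.

```python
def perf_per_area(extra_perf, extra_area):
    """Relative integer performance per unit of die area versus a
    single-core baseline, where the baseline scores 1.0."""
    return (1 + extra_perf) / (1 + extra_area)


# Module: +99% integer performance for +5% die area (figures from the post).
module = perf_per_area(0.99, 0.05)

# Hypothetical comparison: two fully separate cores, +100% for +100% area.
two_cores = perf_per_area(1.00, 1.00)

print(f"module:    {module:.3f}x performance per area")
print(f"two cores: {two_cores:.3f}x performance per area")
```

If those figures held, a module would deliver close to 1.9 times the integer performance per unit of area of the single-core baseline, while two fully duplicated cores would stay at 1.0.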

Remember, the future for both AMD and Intel is Fusion. The 256bit FPU is a HUGE jump up on current FPU power, but in the further future the requirement for balanced FPU/integer isn't there: as GPUs get on die, it will almost always be quicker to run FPU stuff on the on-die GPU.

At which point you'll be wanting to pair 2, or 4, or 8 integer cores per FPU core.

I've said before, Bulldozer is going to be pretty insanely good. It will spank the crap out of a Phenom and probably be within 5% IPC of Sandy Bridge, which is just a massive gain. However, there's also a very, very high potential, due to the die size saving of the second core in a module and the pipeline design, for higher speed parts than Intel can do, though in reality it will probably be on par on clock speed, because they'll be behind on process for 18-24 months.

The time when Bulldozer will become a truly amazing architecture, will be with the on die gpu added, my guess would be on the 22nm shrink.

Admetos · 8 Nov 2010 at 17:51

From what little I've read, the module architecture is there to reduce die size and power requirements, which is what you're saying above.

However, those 2 cores being served by one fetch/decode and scheduler looks to me like running a single thread's instructions over 2 cores. In my previous post I never said workload over two threads, as that's the reverse of what I'm saying. It would be done using some out-of-order logic to look down the single fetch, with the scheduler looking at what can be done in advance on the second core. MIMD over multiple cores, or reverse hyperthreading as others call it.

But as said I've not really looked into the design, and I could be totally wrong. I was just making a general observation.

DragonQ · 8 Nov 2010 at 18:10

No triple channel DDR3? :/

paulc25 · 8 Nov 2010 at 18:25

DragonQ said:
No triple channel DDR3? :/

Nope. Sandybridge is dual channel also.
