
Could IBM be the downfall of AMD?

NathanE said:
AMD64 (the instruction set) is far from elegant I'm afraid. I was speaking entirely about the instruction set and not their implementation of it (e.g. the Athlon 64 and its on-die memory controller and HTT, like you mentioned :))

Yes, I was merely referring to AMD's hardware design, which I think is very good. Any company that can build a fully compatible product which does more for less (i.e. clock cycles) deserves their brownie points. And in the server world with Opteron it scales too, thanks to HyperTransport. Expect to see 8-socket dual-core Opteron boxes soon (16 cores). I don't pretend to understand the details of the instruction set; I imagine the 64-bit facility is just a 'tack on' to the original microcode, so I take your point about the code. Keeps the compiler writers on their toes if nothing else ;)
 
monaco87 said:
Unfortunately with x86 we are stuck with an architecture that goes back to 1981. Outside the PC world there are some eye-popping developments, e.g. Sun's UltraSPARC T1 has 8 cores with each core handling 4 threads. Benchmarks against a 3.6GHz Xeon server show the Sun box averaging 4 to 6 times the performance of the Xeon box. The Sun CPU is clocked at 1GHz and uses only 73W!

:eek: :eek: :eek: What programs and OSs would be able to run on these CPUs?
If only the PC market went down this route :(

How much do they cost out of interest?
 
monaco87 said:
Unfortunately with x86 we are stuck with an architecture that goes back to 1981. Outside the PC world there are some eye-popping developments, e.g. Sun's UltraSPARC T1 has 8 cores with each core handling 4 threads. Benchmarks against a 3.6GHz Xeon server show the Sun box averaging 4 to 6 times the performance of the Xeon box. The Sun CPU is clocked at 1GHz and uses only 73W!

That's all well and good... if the software is designed for that one single CPU. In the PC market it would flop, since code written for another CPU would run very slowly on this CPU.

That's why all x86 CPUs now have out-of-order execution and numerous hidden registers: essentially making up for performance lost to the 'generic' code produced by compilers. Without x86 and these weird, wacky and relatively unknown features in both Athlons and Pentiums, performance in applications would be very, very fast on one CPU (e.g. a Pentium) and very, very slow on another (e.g. an Athlon).

It's like the new X-Fi having the performance of a 3.6GHz P4 in audio applications. The software is designed for the hardware and, in this case, vice versa. The X-Fi would barely run a Win32 version of Space Invaders.
 
Those new Sun SPARC CPUs are fantastic; they really are super quick.

They don't always out-and-out beat, say, the POWER5, but per watt they are ridiculously quick. They also run very cool and are massively parallel. There's a good article on AnandTech about it.
 
Yeah, but didn't they say something like that would suck on a desktop? It's been designed specifically for server-like workloads, but for games / everyday use it wouldn't be all that and a bag of chips?
 
Boogle said:
That's all well and good... if the software is designed for that one single CPU. In the PC market it would flop, since code written for another CPU would run very slowly on this CPU.

That's why all x86 CPUs now have out-of-order execution and numerous hidden registers: essentially making up for performance lost to the 'generic' code produced by compilers. Without x86 and these weird, wacky and relatively unknown features in both Athlons and Pentiums, performance in applications would be very, very fast on one CPU (e.g. a Pentium) and very, very slow on another (e.g. an Athlon).

It's like the new X-Fi having the performance of a 3.6GHz P4 in audio applications. The software is designed for the hardware and, in this case, vice versa. The X-Fi would barely run a Win32 version of Space Invaders.

All CPU designers now accept that clock speed is not the way to better performance; parallelism and threading are the model. This will require a change in software design too. It's ironic that the Windows world is migrating ever so slowly to the way Unix has done things for years, i.e. 64-bit, high-speed interconnects, a highly threaded OS and applications, etc. The new Sun chip just shows what CAN be done when the software and hardware are well matched. I mean, this Sun thing is in a £5000 to £7000 box; that is not expensive stuff in the server world.

I would even go as far as to say that in the processor market there are now only two companies left who are really innovating at all - IBM and Sun. Intel and AMD are really only adding technology in a piecemeal way to an already out-of-date design.

By the way this is turning into a really good thread ... !
 
monaco87 said:
It's ironic that the Windows world is migrating ever so slowly to the way Unix has done things for years, i.e. 64-bit, high-speed interconnects, a highly threaded OS and applications, etc.
I think that's a bit mixed up. It's more that the desktop PC industry is migrating to technologies that are pioneered by the supercomputer industry. This is nothing new. It has been this way since forever really.

At one time Unix didn't support multithreading at all. And it still isn't what I'd call a "highly" threaded OS. If you want an example of that, look at Windows. A typical Windows PC will have around 500 threads! Most Unix variants have a set number of background processes, and most of these do not fork() off. Sure, there are things like Apache that do, but the vast majority of basic software doesn't, which limits its ability to scale.
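The fork()-versus-threads distinction being argued here can be sketched in a few lines. This is an illustrative Python sketch, not from any of the posts (the era's software would be C; `bump` and `counter` are made-up names, and `os.fork` is Unix-only): a thread shares the parent's memory, while a forked child only gets a copy of it.

```python
import os
import threading

counter = 0

def bump():
    global counter
    counter += 1

# A thread shares the parent's address space, so its write is visible:
t = threading.Thread(target=bump)
t.start()
t.join()
after_thread = counter          # the thread's increment is seen here

# A forked child gets its own COPY of the address space, so its
# write is not visible to the parent:
pid = os.fork()
if pid == 0:                    # child process
    bump()                      # bumps the child's copy only
    os._exit(0)
os.waitpid(pid, 0)              # parent waits for the child to finish
after_fork = counter            # unchanged in the parent
```

This is why scaling via fork() (as older Unix daemons did) forces processes to communicate through pipes or shared-memory segments, whereas threads can simply share data structures, at the price of needing synchronisation.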

monaco87 said:
I would even go as far as to say that in the processor market there are now only two companies left who are really innovating at all - IBM and Sun. Intel and AMD are really only adding technology in a piecemeal way to an already out-of-date design.
IMO Intel's Itanium was innovation. See: http://en.wikipedia.org/wiki/Explicitly_Parallel_Instruction_Computing
 
NathanE said:
I think that's a bit mixed up. It's more that the desktop PC industry is migrating to technologies that are pioneered by the supercomputer industry. This is nothing new. It has been this way since forever really.

At one time Unix didn't support multithreading at all. And it still isn't what I'd call a "highly" threaded OS. If you want an example of that, look at Windows. A typical Windows PC will have around 500 threads! Most Unix variants have a set number of background processes, and most of these do not fork() off. Sure, there are things like Apache that do, but the vast majority of basic software doesn't, which limits its ability to scale.


IMO Intel's Itanium was innovation. See: http://en.wikipedia.org/wiki/Explicitly_Parallel_Instruction_Computing

No offence but I'm afraid you show a distinct lack of knowledge of modern Unix. We are not talking about supercomputing; this works on £1000 workstations. When you are looking at Windows "processes", that's mostly what you are seeing: processes. Windows threading is extremely immature compared to, say, Solaris or any other Unix (even Linux). That's why, for example, most Windows servers top out at around 4-8 CPUs; after that, forget it. Whereas the largest Solaris system you will find is 72 dual-core CPUs, i.e. 144 cores. You can only scale to that level with efficient use of threads. Also, it's the same Solaris CD and applications that install on the £1000 single-CPU workstation as install on the 144-core beast, because the OS scales too. There is no need for a 'Home', 'Professional' or 'Server' version, just one OS.

With Windows, when you look at processes you are basically seeing threads too; it's a one-to-one mapping. If I look at the process listing on my Solaris workstation I see processes too, but if I look at even something simple like the browser process (Mozilla in my case), that one process is using 6 threads at idle! So if I use Solaris on an AMD X2 I don't need two apps to get the best out of it; single apps run across both cores. So if you think fork() is all there is to Unix, you have a lot of catching up to do.
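The "single app running across both cores" idea boils down to splitting one job across threads inside one process. A minimal sketch in Python (illustrative names only; note that CPython's GIL prevents pure-Python threads from truly using both cores for CPU-bound work, so this shows the structure rather than the speed-up a C/pthreads program would get):

```python
import threading

def sum_slice(data, lo, hi, results, idx):
    # Each worker thread sums its own slice; because threads share
    # memory, it can write straight into the shared results list.
    results[idx] = sum(data[lo:hi])

data = list(range(1000))
mid = len(data) // 2
results = [0, 0]

workers = [
    threading.Thread(target=sum_slice, args=(data, 0, mid, results, 0)),
    threading.Thread(target=sum_slice, args=(data, mid, len(data), results, 1)),
]
for w in workers:
    w.start()
for w in workers:
    w.join()

total = results[0] + results[1]   # same answer as a single-threaded sum
```

On a dual-core system with a threading implementation that maps threads to cores (pthreads/LWPs, Win32 threads), each half runs on its own core, which is exactly what lets one application benefit from an X2 without any change to the code.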

The new 8-core Sun system (with 4 concurrent threads per core, i.e. 32 threads) I mentioned runs all current Solaris apps unmodified but faster; now that's backwards compatible. How? Because the discipline in Unix apps is threads, always threads (also known as LWPs, or LightWeight Processes).

Until Windows and its applications use the same model, all these multi-core CPUs will be mostly wasted, except for a few apps which explicitly target multi-core or if you kick off multiple tasks.
 
monaco87 said:
No offence but I'm afraid you show a distinct lack of knowledge of modern Unix.
Says who? :)

monaco87 said:
We are not talking about supercomputing; this works on £1000 workstations. When you are looking at Windows "processes", that's mostly what you are seeing: processes.
I know it applies to workstations too. When you look at a basic ps list on a Unix system, that's all you are seeing too. I don't really see the point you are trying to make here?

monaco87 said:
Windows threading is extremely immature compared to, say, Solaris or any other Unix (even Linux).
No, it's superior actually. Windows is built off VMS, which was renowned in its day for having vastly superior virtual memory and multithreading support compared to Unix. Windows also has Completion Ports for server applications so that context switching can be minimised, which Unix still does not have. APCs (asynchronous procedure calls) are still a rarity in Unix variants. NT has had both of these essential threading tools at developers' disposal since day one.

monaco87 said:
That's why, for example, most Windows servers top out at around 4-8 CPUs; after that, forget it.
They don't top out at 8 CPUs though? Where did you hear this?

monaco87 said:
Whereas the largest Solaris system you will find is 72 dual-core CPUs, i.e. 144 cores. You can only scale to that level with efficient use of threads.
I'd argue it's more down to the software you are running than the OS you are using. An OS doesn't really care how many CPUs you've got - its scheduling behaviour will go largely unchanged, regardless. SQL Server is a good example: it has been known to run on systems with at least that number of CPUs/cores. It scales too - because it uses IOCP.

monaco87 said:
Also, it's the same Solaris CD and applications that install on the £1000 single-CPU workstation as install on the 144-core beast, because the OS scales too. There is no need for a 'Home', 'Professional' or 'Server' version, just one OS.
I don't see how this is relevant. So what if Solaris has a single CD image? Microsoft likes to keep stricter controls on licensing. I think you may be getting confused and believe that there are separate Windows kernels for different types of systems (e.g. one kernel for <8 CPUs and then a whole different kernel for 64 CPUs). Well, this is not the case, thankfully :) Microsoft use conditional compilation to achieve this effect.

monaco87 said:
With Windows, when you look at processes you are basically seeing threads too; it's a one-to-one mapping. If I look at the process listing on my Solaris workstation I see processes too, but if I look at even something simple like the browser process (Mozilla in my case), that one process is using 6 threads at idle!
One to one mapping? 'fraid not :)

See screenshot:
threads.gif


monaco87 said:
So if I use Solaris on an AMD X2 I don't need two apps to get the best out of it; single apps run across both cores. So if you think fork() is all there is to Unix, you have a lot of catching up to do.
Yeah this happens on Windows too, just like on any OS worth calling an OS :p Seriously, it's nothing amazing.

fork() was just an example, by the way... There are of course POSIX threads. And most *nix variants created their own distinct ways of doing threading. It still doesn't change the fact that Unix wasn't designed to support threads.

monaco87 said:
The new 8-core Sun system (with 4 concurrent threads per core, i.e. 32 threads) I mentioned runs all current Solaris apps unmodified but faster; now that's backwards compatible. How? Because the discipline in Unix apps is threads, always threads (also known as LWPs, or LightWeight Processes).
I'm not really sure what you're getting at here? If you plugged an equivalent CPU into a Windows server you'd get exactly the same performance increase, probably more.

Also, if you want to get into lightweight threading: Windows has Fibers, which are an extremely basic kind of thread. They give the programmer the control to perform his own scheduling among fibers. I'd guess these will become very useful for game development on Windows in a year or two.
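Fibers are a real Win32 API, but the underlying idea of user-scheduled lightweight threads can be sketched portably. Here is a rough Python analogy using generators, where the caller plays the role of the scheduler (the function names are made up for illustration and this is not the Win32 API itself):

```python
def fiber(name, steps):
    # A generator behaves like a fiber: it runs until it chooses to
    # yield control back to whoever scheduled it (cooperative, not
    # preemptive scheduling).
    for i in range(steps):
        yield f"{name}:{i}"

def round_robin(fibers):
    # The caller acts as the scheduler, resuming each fiber in turn
    # until all of them have finished.
    trace = []
    while fibers:
        for f in list(fibers):          # iterate over a copy
            try:
                trace.append(next(f))   # resume the fiber
            except StopIteration:
                fibers.remove(f)        # fiber finished; drop it
    return trace

trace = round_robin([fiber("a", 2), fiber("b", 2)])
# the two fibers' steps come out interleaved
```

The appeal for games is exactly this: the program decides the switch points itself, so there is no kernel context switch and no lock needed around the hand-off.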

monaco87 said:
Until Windows and its applications use the same model, all these multi-core CPUs will be mostly wasted, except for a few apps which explicitly target multi-core or if you kick off multiple tasks.
Windows uses a superior model so it doesn't need to use the same model. Multi-core is actually a major coup for Microsoft because it is going to unlock all the hard work they put into the Windows NT kernel all those years ago (thanks Dave Cutler, we owe ya!) Windows NT was _designed_ for SMP operation from the ground up. Unix wasn't.



I realise you are going to throw this all back at me now. So let's leave it there. I'm sure you're not going to be easily swayed on this subject, but neither am I.
 
NathanE said:
Say's who? :)


Windows also has Completion Ports for server applications so that context switching can be minimised, which Unix still does not have.

Actually, on the new 8-core chip context switching has been largely eliminated. Each core is doled out 4 threads, and the core then manages those threads locally using multiple register banks. When one thread stalls on a memory access, the core switches to the next one.


They don't top out at 8 CPUs though? Where did you hear this?
No, it is possible to run Windows on 8+ CPUs, but the law of diminishing returns applies. Above about 8, each additional CPU scales by about 0.6 on a Unisys system. So for every £100 you might as well bin £40.

I'd argue it's more down to the software you are running than the OS that you are using. An OS doesn't really care how many CPU's you've got - its scheduling behaviour will go largely unchanged, regardless. SQL Server could be a good example and this has been known to run on supercomputers of that at least that number of CPUs/cores. It scales too - because it uses IOCP.
Wrong: on Solaris the scheduler changes behaviour depending on the type of CPUs. On systems with multi-core CPUs, or a mix of single- and multi-core, it localises threads on the multi-core parts so that threads can avail of the shared local cache.

I don't see how this is relevant. So what if Solaris has a single CD image? Microsoft likes to keep stricter controls on licensing. I think you may be getting confused and believe that there are seperate Windows kernels for different types of systems (e.g. one kernel for <8 CPUs and then a whole different kernel for 64 CPUs.) Well this is not the case, thankfully :) Microsoft use conditional compilation to achieve this effect.

Errr... conditional compilation means different bits get compiled... i.e. different kernels.

I'm not really sure what you're getting at here? If you plugged an equivilent CPU into a Windows server you'd get exactly the same performance increase - probably more.

I don't think so. We are nearly ten years down the line from NT 3.51 and the Dells and HPs of this world still can't make systems over 8 CPUs work well. That should tell you something.

Windows uses a superior model so it doesn't need to use the same model. Multi-core is actually a major coup for Microsoft because it is going to unlock all the hard work they put into the Windows NT kernel all those years ago (thanks Dave Cutler, we owe ya!) Windows NT was _designed_ for SMP operation from the ground up. Unix wasn't.

See point above. Multi-core is really no different from multi-CPU. If that were the case, where have all the 32/64/128-CPU Windows boxes been for the last 10 years? The first Solaris SMP kernel was 1992, by the way, designed from the ground up. It has been ever since.


I realise you are going to throw this all back at me now. So let's leave it there. I'm sure you're not going to be easily swayed on this subject, but neither am I.

This is not really a debate about Windows vs Unix. The fact is, however, that if the Windows OS & apps were designed along Unix lines then the users on this board could drop in a multi-core chip and bang... extra performance. Fact is they can't, except with a select few apps. I thought about dual core myself for media work. I asked on here which apps used the two cores; I got 3 app names. Having multiple threads is not the answer when you are developing for single-CPU systems. Designing for multi-threaded environments is... and those environments are heading towards lots of cores, multiple threads per core and very short pipelines (if a pipeline stalls, who cares, just switch to the next thread while you re-fetch).
 
monaco87 said:
Actually, on the new 8-core chip context switching has been largely eliminated. Each core is doled out 4 threads, and the core then manages those threads locally using multiple register banks. When one thread stalls on a memory access, the core switches to the next one.
Yeah that's all very well until you have more runnable threads than the CPU can naturally handle.

monaco87 said:
No, it is possible to run Windows on 8+ CPUs, but the law of diminishing returns applies. Above about 8, each additional CPU scales by about 0.6 on a Unisys system. So for every £100 you might as well bin £40.
Not true. Totally not true. Windows scales just as well as (if not better than) an equivalent Unix OS. The law of diminishing returns (i.e. scalability) kicks in with the software you are running, not so much with the OS itself. This is a host software design problem. It's about how much concurrency you can squeeze out of your code by reducing lock times and improving synchronisation algorithms. It has little, almost nothing, to do with the OS. The only part where scalability really comes into it with the OS itself is scheduling, and there's only so much an OS can do in this regard. Most of it boils down to quantum tweaking, dynamic priority boosting and reduced locking of the memory structures associated with the scheduler. Recognising multi-core, hyperthreading and NUMA systems and changing scheduling behaviour accordingly is of course also involved, but these are trivial matters in comparison.

monaco87 said:
Wrong: on Solaris the scheduler changes behaviour depending on the type of CPUs. On systems with multi-core CPUs, or a mix of single- and multi-core, it localises threads on the multi-core parts so that threads can avail of the shared local cache.
There are subtle changes on Windows too - but nothing major to speak of. Windows also tries to keep threads localised to avoid cache flushing (performance hit). This is OS Design 101. Nothing spectacular to speak of. Hyperthreading, multi-core and NUMA systems are all nicely accounted for by Windows' scheduler.

monaco87 said:
Errr... conditional compilation means different bits get compiled... i.e. different kernels.
Well, yeah, but it's still the same code base.

monaco87 said:
I don't think so. We are nearly ten years down the line from NT 3.51 and the Dells and HPs of this world still can't make systems over 8 CPUs work well. That should tell you something.
It's not Microsoft's fault if the hardware isn't available. Windows is tied to x86, and x86 has only just recently jumped on the multi-core bandwagon. Microsoft got its hands on some 64-proc x86 hardware, and as such Windows Server 2003 is validated to work on such a system.

monaco87 said:
See point above. Multi-core is really no different from multi-CPU. If that were the case, where have all the 32/64/128-CPU Windows boxes been for the last 10 years? The first Solaris SMP kernel was 1992, by the way, designed from the ground up. It has been ever since.
I didn't say multi-core was different to multiple CPUs? Where have 128-CPU x86 boxes been for the past 10 years? I don't know :) Ask Intel or AMD :p As I was saying, now that multi-core is kicking off in the x86 world it is going to be a big coup for Microsoft, because finally the power of SMP in the Windows kernel will be unlocked for all users, even home desktop users, to see.

I think you are in for a bit of a shock ;)

monaco87 said:
This is not really a debate about Windows vs Unix. The fact is, however, that if the Windows OS & apps were designed along Unix lines then the users on this board could drop in a multi-core chip and bang... extra performance. Fact is they can't, except with a select few apps. I thought about dual core myself for media work. I asked on here which apps used the two cores; I got 3 app names. Having multiple threads is not the answer when you are developing for single-CPU systems. Designing for multi-threaded environments is... and those environments are heading towards lots of cores, multiple threads per core and very short pipelines (if a pipeline stalls, who cares, just switch to the next thread while you re-fetch).
You're right, it's not a debate about OSes. But I get the distinct impression you're blaming Windows for shortcomings that aren't its fault. You can't blame the software just because the hardware is rubbish in comparison to the common stuff you'd see Solaris running on :) And you really need to stop this silliness about Windows not supporting SMP properly. If you drop a multi-core CPU into a Windows system then you will get a performance boost straight away, regardless of the software you're using... just look around this very forum for first-hand opinions on that.

It's unfair to compare a desktop OS and its host software to a server platform such as Solaris (or any Unix, really). Being a solely server platform, pretty much all of its server software is designed to be multithreaded. With Windows it is the same case: SQL Server is highly MT, as is IIS, as is MySQL; basically anything server-related you can bet will be MT. Media encoders have also been MT for a long, long time now, on all OSes. However, despite regular desktop software being MT (see Task Manager screenshot in previous post), it is not yet concurrent enough to see big gains. As an example: Firefox may use 6 threads, but I can guarantee one of those will be for the GUI, one for pulling HTTP documents, one for rendering and the rest for background worker jobs. It uses the threads to make the GUI feel smooth; it doesn't use them to get more work done in the same amount of time as such, not in the way a server program would. This, by the way, is basic software design methodology; it has nothing to do with the OS. Almost every Windows application will have at least 2 threads: one for its GUI and one for everything else. A lot, though, will make use of more threads than that (again, see Task Manager screenshot).
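The GUI-thread-plus-workers pattern described above can be sketched with a job queue. This is an illustrative Python sketch, not from any of the posts: the "GUI" thread hands work to a background thread through a queue and stays free to keep the interface responsive, rather than using threads to finish the work faster.

```python
import queue
import threading

jobs = queue.Queue()
done = queue.Queue()

def worker():
    # Background thread: pull jobs, do the "slow" work, post results.
    while True:
        job = jobs.get()
        if job is None:          # sentinel value: shut down cleanly
            break
        done.put(job * job)      # stand-in for a slow computation

t = threading.Thread(target=worker)
t.start()

# The "GUI" thread just hands work off; between puts it would be free
# to keep redrawing the interface and responding to input.
for n in (2, 3, 4):
    jobs.put(n)
jobs.put(None)
t.join()

results = sorted(done.get() for _ in range(3))
```

Note the design choice: the queues do all the synchronisation, so neither thread ever touches the other's data directly. That is why this pattern makes a GUI feel smooth without adding any throughput; the same work still happens on one worker thread.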

Do you see now why your comparison is unfair? :)
 
Goksly said:
Yeah, but didn't they say something like that would suck on a desktop? It's been designed specifically for server-like workloads, but for games / everyday use it wouldn't be all that and a bag of chips?


Indeed, it would suck in the current Windows / software climate; it's too parallel (as is my understanding, anyway).

But it's a pretty exciting platform! Getting so much raw processing power from so little power used is insane.
 
hogfather said:
Indeed, it would suck in the current Windows / software climate; it's too parallel (as is my understanding, anyway).

It's too restrictive: the software has to be written for the CPU specifically (or at least with it in mind, and with the compiler targeting that one CPU). If there were an architecture change, then performance on the new CPU would be far lower than you would expect.

In short, the CPU is nice in the market it is designed for. In the general PC market it wouldn't be as fast as an Athlon 64 / Pentium 4. In fact it would (if it supported x86) be nearer the level of a VIA C3, albeit with 4 of them.
 
This is possibly one of the most articulate threads I have ever read on this site - well done, guys!

(is it thread safe?)

The (business) reality from my perspective is that my large enterprise customers used (past tense) 48-CPU HP V-Class and Solaris E10K boxes. We also have a large customer using a 128-CPU Siemens box. The largest NT server I have encountered had 8 Pentium (1) processors in it.

These customers are now downsizing as processors become faster and I'm sure they will continue to do so as multi-core processors come on board. (lower footprint, less heat, reduced power and aircon requirements).

Wasn't Itanium born out of the fruits of HP's acquisition of Compaq and their Tru64 processor?
 
gEd said:
The largest NT server I have encountered had 8 Pentium (1) processors in it.
That's because soon after 8 CPUs the price of x86 hardware really does shoot up fast. Prices start hitting mainframe and small supercomputer territory.

Luckily, as you said, multi-core is taking off now, and it will only be a couple of years before there are fairly cheap x86 servers with 32 CPUs/cores or more.

x86 has only recently gained NUMA and direct-connection bus (e.g. HyperTransport) capabilities, so really its ability to scale well in the server world has only just been unlocked.

gEd said:
Wasn't Itanium born out of the fruits of HP's acquisition of Compaq and their Tru64 processor?
Since it was Intel's first non-x86 design in decades and their first 64-bit CPU, I'd guess they borrowed some ideas from that CPU. But the resultant architecture itself was completely different from Tru64.
 