AMD prepares three-core processors

NathanE · 18 Sep 2007 at 16:04

Mr Jack said:
Um. No. Really no. Multi-function threads are inelegant, inefficient and difficult to write.

Managing your own threads is tricky and shows negligible gains over letting the system do it; and it's far, far worse on a PC where you have an unknown number of other threads running. If you want multi-threaded goodness the proper way of doing it is to have many different threads capable of running independently with few sync points. Then you throw them at the thread manager and let it pick the best way to divide the load. The problem with this approach is that it fits badly with the natural model for games where you have a large number of important sync points and interdependent systems.

Of course, on a PS3, for example, you have complete hardware control and can, if you want, push the edges of the system by custom balancing each CPU's load - but, believe me, writing that kind of code is the exact opposite of elegant.

Not heard of IOCP then?

And no, "multi-function threads" (not heard that term before but I'll use it) are the future. The next generation game "Crysis" uses this design. It is highly elegant (when implemented properly with support from the OS - in NT this is called IOCP) and allows the game to continue scaling when more multi-cored processors arrive. Whereas a design that has hard coded itself to use only a specific number of threads that only do specific tasks will not scale at all. So you definitely won't see any game engines that are intended to be OEM'd written in this way - as those engines are always written with long term scalability in mind.

Thread synchronisation (accessing shared data) is always a problem but it's no easier or more difficult in a fully elegant solution. In some ways it is easier IMO. Either way it's a moot point really as games these days use lock-free algorithms in their performance-critical code paths - U3 and Crysis certainly do anyway.

NathanE · 18 Sep 2007 at 16:11

stickroad said:
And is there currently any software available that does this, Nathan?

MS SQL Server and IIS web server

A lot of .NET software uses it too - often without even realising, simply because .NET uses it internally whenever you use ThreadPool.QueueUserWorkItem() amongst other things.

Crysis uses it too and so does the U3 engine to a slightly lesser extent. U3 hasn't quite transitioned to a fully elegant solution. They have converted the core engine to it but they have retained "worker threads" in some areas as well. Definitely a migration path in place though.

Also the Xbox 360 SDK pushes developers down the route of a "3 threads plus work job packets" architecture. It's not forced, but most of the advanced SDK documentation assumes that design. The Xbox 360 as you probably know runs a stripped Windows NT kernel that supports IOCP.

melbourne720 · 18 Sep 2007 at 16:15

I have mixed feelings about this. I agree with NathanE's point about it making good economic sense with failed quads. On the other hand I was hoping that the number of cores in CPUs was going to go up exponentially! 1, 2, 4, 8, 16, 32...

cavemanoc · 18 Sep 2007 at 16:46

They will - this is just a price point thing - think x800 gto2 - mmmmm - free unlock - mmmmmmm

NathanE · 18 Sep 2007 at 16:51

Yup this is just a marketing & economics thing. I wouldn't let it bother you about the future of multi-core

I suspect most of these chips will be heading to OEMs anyway, simply because they are the ones who will be able to shift them to average joes who like the sound of "tri-core"

Mr Jack · 18 Sep 2007 at 17:05

NathanE said:
Not heard of IOCP then?

I'm not sure what you mean by IOCP. The only meaning of IOCP I can find (I/O Completion Ports) does not really do what you are describing. Could you be more precise?

The next generation game "Crysis" uses this design. It is highly elegant (when implemented properly with support from the OS - in NT this is called IOCP) and allows the game to continue scaling when more multi-cored processors arrive.

You cannot write software to run on an unspecified number of threads. Or, at least, not using any fast, or popular, programming language. You can go to full-on parallel programming languages, but even they are of limited scalability, and they'd produce poor performance on any existing hardware, so I find it unlikely that anyone is currently using them.

Whereas a design that has hard coded itself to use only a specific number of threads that only do specific tasks will not scale at all. So you definitely won't see any game engines that are intended to be OEM'd written in this way - as those engines are always written with long term scalability in mind.

I doubt very much indeed you will see any game engine currently in development designed to be massively parallel. There's no point.

Thread synchronisation (accessing shared data) is always a problem but it's no easier or more difficult in a fully elegant solution. In some ways it is easier IMO. Either way it's a moot point really as games these days use lock-free algorithms in their performance-critical code paths - U3 and Crysis certainly do anyway.

The more threads you have the more complicated it is. Single function threads allow you to work lock free (i.e. fast) except at known transfer points. So, for example, you can kick the physics data for a frame over to a physics thread, let it do its stuff on another processor and then pick up the data when it completes and pass it over to a rendering thread.
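The hand-off described above can be sketched roughly as follows. This is only an illustration in Python, not code from any game engine: the names `physics_worker` and `run_frames`, the dictionary layout of a frame, and the use of `None` as a shutdown sentinel are all assumptions made for the example. The two queues are the only sync points; between them the physics thread is the sole owner of the frame data.

```python
import queue
import threading

def physics_worker(inbox, outbox):
    """Dedicated physics thread (hypothetical): only ever does physics."""
    while True:
        frame = inbox.get()            # transfer point 1: receive a frame
        if frame is None:              # sentinel (assumed convention): shut down
            outbox.put(None)
            break
        # No lock is taken on `frame` itself: only this thread touches it
        # between the two hand-offs.
        frame["positions"] = [
            p + v for p, v in zip(frame["positions"], frame["velocities"])
        ]
        outbox.put(frame)              # transfer point 2: hand over for rendering

def run_frames(frames):
    """Push frames through the physics thread, then 'render' them in order."""
    to_physics, to_render = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=physics_worker, args=(to_physics, to_render))
    worker.start()
    for frame in frames:
        to_physics.put(frame)
    to_physics.put(None)
    rendered = []
    while True:
        frame = to_render.get()
        if frame is None:
            break
        rendered.append(tuple(frame["positions"]))  # stand-in for real rendering
    worker.join()
    return rendered
```

Because there is a single physics thread and a single consumer, frames come out in the order they went in, which is the property this model is being praised for.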

NathanE · 18 Sep 2007 at 17:45

Mr Jack said:
I'm not sure what you mean by IOCP, the only meaning of IOCP (I/O Completion Ports) I can find that uses it is not really doing what you are describing. Could you be more precise?

http://www.microsoft.com/technet/sysinternals/information/IoCompletionPorts.mspx

You cannot write software to run on an unspecified number of threads.

So you are saying you cannot dynamically spawn a thread and then give it work on a dynamic basis? IOCP does exactly that

Hell you can even do that without IOCP... Any ThreadPool provider will offer such functionality that you say is impossible

Mr Jack said:
Or, at least, not using any fast, or popular, programming language. You can go to full-on parallel programming languages but even they are of limited scalability, but they'd produce poor performance on any existing hardware so I find it unlikely that anyone is currently using them.

IOCP can be used from C++ and .NET... both very fast and popular development platforms. IOCP is used in almost all server software that needs high throughput and like I say, increasingly, games are using it - or derivatives of it.

Mr Jack said:
I doubt very much indeed you will see any game engine currently in development designed to be massively parallel. There's no point.

Woah slow down here, where did "massively parallel" come from all of a sudden?! We are talking single PC scalability, not distributed computing here! There are plenty of upcoming game engines that use a truly scalable threading model - Crysis for example.

Mr Jack said:
The more threads you have the more complicated it is.

Yes, that is the case if you use a broken threading model whereby you create a thread and then that thread is only allowed to do one type of work - e.g. graphics thread, sound thread, physics thread. That model is unsustainable, inelegant and hard to maintain.

Mr Jack said:
Single function threads allow you to work lock free (i.e. fast) except at known transfer points. So, for example, you can kick the physics data for a frame over to a physics thread, let it do its stuff on another processor and then pick up the data when it completes and pass it over to a rendering thread.

Yes, but see above: that type of threading model is not sustainable and does not scale well. IOCP with its concept of "work unit" packets that can be executed on any thread is the future.

A work unit is simply an abstraction. The idea is to make concrete classes of it, such as RenderFrame (or maybe even RenderObject for those really fine tuned game engines!) or PhysicsCalculation or SoundOutput etc. These work units are simply queued to the completion port and then any one of the available threads can pick one up and execute it according to the instructions it contains.
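A minimal sketch of the "work unit" idea, assuming Python and a plain `queue.Queue` standing in for the completion port. This is not IOCP itself (on Windows a real completion port is driven through Win32 calls such as `PostQueuedCompletionStatus` and `GetQueuedCompletionStatus`), and the class bodies, the `execute` method and `run_work_units` are hypothetical names invented for illustration. Note also that CPython threads do not run Python bytecode in parallel, so this shows the structure rather than the speed-up.

```python
import queue
import threading
from abc import ABC, abstractmethod

class WorkUnit(ABC):
    """Abstract unit of work; any pool thread may execute any unit."""
    @abstractmethod
    def execute(self):
        ...

class PhysicsCalculation(WorkUnit):
    def __init__(self, position, velocity, dt):
        self.position, self.velocity, self.dt = position, velocity, dt

    def execute(self):
        return ("physics", self.position + self.velocity * self.dt)

class SoundOutput(WorkUnit):
    def __init__(self, sample):
        self.sample = sample

    def execute(self):
        return ("sound", self.sample)

def run_work_units(units, num_threads):
    """Queue the units and let `num_threads` generic workers drain the queue."""
    pending = queue.Queue()            # stand-in for the completion port
    results, results_lock = [], threading.Lock()

    def worker():
        while True:
            unit = pending.get()
            if unit is None:           # sentinel (assumed convention): stop
                return
            outcome = unit.execute()   # the thread does not care what kind it is
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for unit in units:
        pending.put(unit)
    for _ in threads:
        pending.put(None)              # one sentinel per worker
    for t in threads:
        t.join()
    return results                     # completion order is not guaranteed
```

The number of threads is a parameter rather than something baked into the design, which is the scaling argument being made; the results come back in whatever order the workers finish.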

Mr Jack · 18 Sep 2007 at 19:51

NathanE said:
So you are saying you cannot dynamically spawn a thread and then give it work on a dynamic basis? IOCP does exactly that. Hell you can even do that without IOCP... Any ThreadPool provider will offer such functionality that you say is impossible

Sorry, I should have said: you cannot write software to run a specified amount of work on an unspecified number of threads. Games run a specified amount of work that cannot be efficiently split over an unspecified number of threads.

Any thread - at any given time - is only working on one thing. The allocation of existing threads to new tasks is not, in principle, any different from creating and destroying new threads - the reason IOCP does it that way is to manage uncertain load.

IOCP can be used from C++ and .NET... both very fast and popular development platforms. IOCP is used in almost all server software that needs high throughput and like I say, increasingly, games are using it - or derivatives of it.

Ok, game servers I'll give you

Woah slow down here, where did "massively parallel" come from all of a sudden?! We are talking single PC scalability, not distributed computing here! There are plenty of upcoming game engines that use a truly scalable threading model - Crysis for example.

Massively parallel and "truly scalable threading model" are the same thing; the only difference is in the load spreading. Threads are a form of simulated parallel programming (or, if they end up running on different cores, genuine parallel programming) - the techniques and challenges are very similar.

Yes, that is the case if you use a broken threading model whereby you create a thread and then that thread is only allowed to do one type of work - e.g. graphics thread, sound thread, physics thread. That model is unsustainable, inelegant and hard to maintain.

There's no basic difference between creating a new thread and re-using an old thread. You still end up with the same situation. Games contain few opportunities to have genuine separability in function, and fewer where time is freely constrained, and are thus poor candidates for separability.

Yes, but see above: that type of threading model is not sustainable and does not scale well. IOCP with its concept of "work unit" packets that can be executed on any thread is the future.

IOCP works because of the environment it works under - namely where work is requested on an effectively random basis by an external source - this is not the case under which games (game servers excepted) operate.

A work unit is simply an abstraction. The idea is to make concrete classes of it, such as RenderFrame (or maybe even RenderObject for those really fine tuned game engines!) or PhysicsCalculation or SoundOutput etc. These work units are simply queued to the completion port and then one of the available threads can execute it given the instructions it contains.

Which is all very well when you have separable, freely orderable objects. That is rarely the case in games. Render order is very important, more so when you throw in transparency. What's more, delivering the output of that work to its destination is also time critical and order dependent.

NathanE · 18 Sep 2007 at 21:16

First of all I have to ask: Have you *ever* written multi-threaded code? And I don't mean the basic 'background worker thread' model, but a *real* industry-grade design. I don't mean that to be offensive, it's just I get a real feeling from your posts that you've read up a bit about MT but never really done it in practice.

Sorry, I should have said: you cannot write software to run a specified amount of work on an unspecified number of threads. Games run a specified amount of work that cannot be efficiently split over an unspecified number of threads.

WTF?

Are you reading *anything* I write or just completely ignoring it? Again I don't mean to spark an argument here but you seem to be just spitting back what I write in my face without so much as an explanation.

Any thread - at any given time - is only working on one thing.

Said the Computer Science 101 student

Ok, game servers I'll give you

Nope, *full games* (not just their server side) are written with an IOCP "work unit" design these days. See Crysis, Eve Online etc.

Massively parallel and "truly scalable threading model" are the same thing; the only difference is in the load spreading. Threads are a form of simulated parallel programming (or, if they end up running on different cores, genuine parallel programming) - the techniques and challenges are very similar.

"only difference is the load spreading"? How on earth can you know that when those design names reveal nothing about their underlying implementation and finer design details?

There's no basic difference between creating a new thread and re-using an old thread. You still end up with the same situation. Games contain few opportunities to have genuine separability in function, and fewer where time is freely constrained, and are thus poor candidates for separability.

Creating a thread is very expensive on NT. Very very expensive. It locks the thread scheduler and needs memory allocation. In short, it's not something any application should do at any time other than startup.
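As a rough illustration of the reuse being argued for here, the sketch below runs many small tasks on a fixed pool and counts how many distinct threads actually did the work. It uses Python's `ThreadPoolExecutor` as a stand-in for an NT thread pool, and `count_threads_used` is a hypothetical helper written for this example. It demonstrates reuse only; it makes no measurement of what thread creation costs on NT.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def count_threads_used(num_tasks, pool_size):
    """Run `num_tasks` trivial tasks and report how many threads ran them."""
    seen = set()
    lock = threading.Lock()

    def task(_):
        # Record the identity of whichever pool thread picked this task up.
        with lock:
            seen.add(threading.get_ident())

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        list(pool.map(task, range(num_tasks)))   # wait for every task
    return len(seen)
```

With 100 tasks and a pool of 4, the count can never exceed 4: the threads are created once and then handed new work, instead of one thread being created per task.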

I can think of plenty of opportunities where a game can gain parallelism advantages. Like before, one (lesser) game might just have a RenderFrame work unit. Another, better game might have a finer-grained RenderObject work unit. That way it can dispatch several RenderObject work units and have them processed on multiple CPUs/cores, and then when they are done it can fire off a RenderFrame work unit which takes their results and combines them into a single scene. There might also be a RenderEffect work unit which can render all the pixel shaders and such like. That is just scratching the surface as well; games are practically infinitely optimisable on a multi threaded platform with a good design underneath.
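The fan-out / fan-in shape described above could look something like this. It is a sketch under assumptions, not how Crysis or any other engine does it: `render_object`, `render_frame` and `draw_scene` are hypothetical names, an "object" is just a `(name, depth)` pair, and "pixels" are strings. The combine step sorts by depth, back to front, so the final frame does not depend on which worker finished first.

```python
from concurrent.futures import ThreadPoolExecutor

def render_object(obj):
    """RenderObject work unit (hypothetical): independent of other objects."""
    name, depth = obj
    return {"name": name, "depth": depth, "pixels": f"<{name}>"}

def render_frame(fragments):
    """RenderFrame work unit (hypothetical): combine fragments back to front."""
    ordered = sorted(fragments, key=lambda f: f["depth"], reverse=True)
    return "".join(f["pixels"] for f in ordered)

def draw_scene(objects, max_workers=4):
    # Fan out: one RenderObject unit per object, run on whatever threads exist.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        fragments = list(pool.map(render_object, objects))
    # Fan in: only runs once every RenderObject unit has completed.
    return render_frame(fragments)
```

For example, `draw_scene([("tree", 5), ("sky", 100), ("player", 1)])` composes the sky first and the player last regardless of `max_workers`.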

IOCP works because of the environment it works under - namely where work is requested on an effectively random basis by an external source - this is not the case under which games (game servers excepted) operate.

So 5 mins ago you didn't know what IOCP was and now you are educating me on it? :confused:

I would have said a game's load type is quite random. Not that it matters though. I'm not quite sure how the IOCP's type of load would be any different to what a lesser design has to do. It's just different means to an end - one of which (IOCP) is faster, more maintainable, more elegant and more scalable than the other.

Which is all very well when you have separable, freely orderable objects. That is rarely the case in games. Render order is very important, more so when you throw in transparency. What's more, delivering the output of that work to its destination is also time critical and order dependent.

Any multi threaded design is inherently "out of order". IOCP is no exception. Luckily multi threaded programmers have plenty of ways to bring order back into the design. Lock-free queues are the prime example: they have been used in server software for years and are now being used in upcoming multi threaded games.
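One simple way of "bringing order back" can be sketched as below. To be clear about the substitution: this is not a lock-free queue, which is what the post names. It tags each task with a sequence number and releases results strictly in that sequence, buffering any that finish early. The function names `process` and `in_order_results` are hypothetical, and upper-casing a string is just a placeholder for real work.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor, as_completed

def process(seq, payload):
    """Placeholder work: returns the sequence number with the result."""
    return seq, payload.upper()

def in_order_results(payloads, max_workers=4):
    """Yield results in submission order even if tasks finish out of order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(process, i, p) for i, p in enumerate(payloads)]
        early, next_seq = [], 0        # min-heap of results that arrived early
        for fut in as_completed(futures):
            heapq.heappush(early, fut.result())
            # Release every buffered result that is now next in line.
            while early and early[0][0] == next_seq:
                yield heapq.heappop(early)[1]
                next_seq += 1
```

The workers are free to complete in any order; only the consumer side imposes the ordering, and it pays for that with buffering rather than by serialising the work.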

PhilGQ · 26 Sep 2007 at 13:49

From what I hear, the third core is for background stuff, like antivirus and all the other stuff people have running, thereby freeing the first two cores for other tasks.

With so few apps able to use more than a single core, the advantage is in multitasking more than in making one app go quicker, although that is nice when it is done.

Also stuff like virtualisation benefits. I have an X2 3800 with 2GB RAM running Vista; when I tried running two images of Windows Server in VMware, let's just say a third core could have been useful.

I think I will buy a dual socket motherboard next time and have 8 cores just to be on the safe side. When I save the money, that is.

Stelly · 26 Sep 2007 at 13:59

Never mind

Stelly

UEX · 26 Sep 2007 at 21:37

I still think it's unnecessary
