The NVidia GV100 News Thread

Caporegime
Joined
18 Oct 2002
Posts
32,618
That's why I can't understand why all the gamers were getting excited about it. We're never going to see a GTX gaming card with anything close to this, are we?
The gaming GPUs will be even better because they won't have the non-gaming support for FP64 and so on. The fact that the CUDA cores are now 50% more efficient will have a big impact, as will the new thread scheduling.
 
Soldato
Joined
19 Apr 2003
Posts
13,513
No, but the tech advancement will likely bring other advantages with it that can be applied to GeForce, and likewise the potential specs, once the other stuff is stripped out, leave room for a lot of performance.
Agreed, very much like F1: aerodynamics, KERS, active suspension, ABS etc. have all been adapted and variations adopted in everyday road cars over the years.

Just hope they adopt it sooner rather than later.
 
Associate
Joined
28 Jan 2010
Posts
1,547
Location
Brighton
I was surprised how large and powerful the GV100 is to be honest, especially including Tensor cores.

Of course the GTX 2080 will not have the FP64 cores, and probably also not the Tensor cores exactly as-is, but I wonder if Nvidia will offer some mixed-precision stuff for FP16 and/or FP8 like AMD are doing with Vega.

AMD has shown you can use half-precision (FP16) for certain tasks without losing visual quality (like hair rendering).

It'll be interesting if we see large performance gains in game engines if they start utilising FP16/FP8 workloads for suitable tasks. Could this be a sign of the hardware starting to support that?
 
Caporegime
Joined
18 Oct 2002
Posts
32,618
I was surprised how large and powerful the GV100 is to be honest, especially including Tensor cores.

Of course the GTX 2080 will not have the FP64 cores, and probably also not the Tensor cores exactly as-is, but I wonder if Nvidia will offer some mixed-precision stuff for FP16 and/or FP8 like AMD are doing with Vega.

AMD has shown you can use half-precision (FP16) for certain tasks without losing visual quality (like hair rendering).

It'll be interesting if we see large performance gains in game engines if they start utilising FP16/FP8 workloads for suitable tasks. Could this be a sign of the hardware starting to support that?


Consumer Volta will most likely have FP16 support. Pascal GP100 already had it, and it isn't clear whether consumer Pascal has it but disabled in drivers, or whether it's missing entirely. You can specify FP16 variables but there is not a 2x speed-up; potentially you get the benefits of the smaller size, with less register pressure, better cache coherence and so on. It is always a trade-off though; it may just be better to spend the transistors on additional FP32 cores.
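For anyone wondering what "specifying FP16 variables" looks like in practice, here's a minimal CUDA sketch (kernel name and setup are just for illustration). The data is stored as half precision, which saves bandwidth and register space, but without native FP16 maths units the arithmetic itself still runs at FP32 rate:

Code:
#include <cuda_fp16.h>

// Element-wise scale of an FP16 array. Storing in __half halves the
// memory traffic and register footprint, but the multiply is done by
// promoting to FP32, so there is no automatic 2x compute speed-up.
__global__ void scale_fp16(const __half* in, __half* out, float s, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = __half2float(in[i]);   // promote to FP32
        out[i] = __float2half(x * s);    // store back as FP16
    }
}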
 
Soldato
Joined
24 Jun 2004
Posts
10,977
Location
Manchester
Would you care to explain this for numpties like me?

Okay - I'll give it a try!

Think of a matrix as like a 2D array of numbers; say N x N numbers arranged in a table.

Most scientific computing makes heavy use of "matrix - vector" multiplication, where the matrix is multiplied by a 1D array of numbers (a vector). For a matrix of size "N", this takes "N^2" floating point operations. If you want to make this sort of operation parallel, you are usually limited by how fast you can transfer data around the machine (i.e. by memory bandwidth). For these sorts of applications GPUs aren't so effective, because you saturate the memory bandwidth long before you max out the compute capability.
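To make the bandwidth argument concrete, here's a rough CUDA sketch of a naive matrix-vector multiply, one thread per output row (names and layout are illustrative):

Code:
// y = A * x for an N x N row-major matrix A.
// Roughly 2*N^2 flops against ~N^2 matrix elements read from memory,
// i.e. only a couple of operations per value loaded - the kernel hits
// the memory bandwidth limit long before the FP32 cores are saturated.
__global__ void matvec(const float* A, const float* x, float* y, int N)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < N) {
        float sum = 0.0f;
        for (int col = 0; col < N; ++col)
            sum += A[row * N + col] * x[col];
        y[row] = sum;
    }
}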

But, for certain scientific computing applications, you can formulate the problem in terms of matrix-matrix multiplications instead. Here you multiply one matrix (2D array) with another. This costs "N^3" operations, so you end up transferring a similar amount of data around, but need "N" times more computations. These are the sort of applications that benefit the most from GPUs, because you can unleash the full computing power of the GPU (... at least if your matrices are big enough, i.e. for large N).
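And the matrix-matrix case, again as a naive sketch with one thread per output element (illustrative only; real libraries tile this through shared memory):

Code:
// C = A * B for N x N row-major matrices.
// About 2*N^3 flops against ~3*N^2 values of data, so the work per
// byte moved grows with N - for large matrices the FP32 cores, not
// memory bandwidth, become the limiting factor.
__global__ void matmul(const float* A, const float* B, float* C, int N)
{
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < N && col < N) {
        float sum = 0.0f;
        for (int k = 0; k < N; ++k)
            sum += A[row * N + k] * B[k * N + col];
        C[row * N + col] = sum;
    }
}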

Previously, once you had moved your two matrices into the GPU memory, you would use the standard FP32 or FP64 cores to do the calculation - that is, you set up a list of instructions for the CUDA cores to carry out in order to do the multiplication. With the "tensor cores" it seems that the *entire matrix-matrix multiplication* is done in hardware. So, the only thing you can send to tensor cores is a pair of matrices, but you will get the result much much faster than by going through the general purpose FP32 cores.
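For reference, this is roughly how CUDA (from version 9 onwards) exposes the tensor cores through its WMMA interface in mma.h: a warp loads small FP16 tiles into "fragments" and a single call performs the whole 16x16x16 multiply-accumulate in hardware, with an FP32 accumulator. A sketch, not production code:

Code:
#include <cuda_fp16.h>
#include <mma.h>
using namespace nvcuda;

// One warp computes one 16x16 output tile on the tensor cores.
__global__ void wmma_tile(const half* A, const half* B, float* C)
{
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::row_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    wmma::fill_fragment(c_frag, 0.0f);
    wmma::load_matrix_sync(a_frag, A, 16);   // leading dimension = 16
    wmma::load_matrix_sync(b_frag, B, 16);
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);   // whole tile in hardware
    wmma::store_matrix_sync(C, c_frag, 16, wmma::mem_row_major);
}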


So, any scientific application that relies heavily on matrix-matrix multiplications (where all or most of the numbers in the matrix are non-zero) could see a further ~10x speedup from this setup, on top of the (probably) 20-100x speedup they already see over using a CPU.

For applications like machine learning, or molecular mechanics simulation, you could see a very real 10x speedup (if Nvidia's numbers are to be believed). For applications like finite element analysis, or computational fluid dynamics, it's not going to help at all.
 
Man of Honour
Joined
13 Oct 2006
Posts
91,119
Quick and dirty explanation is that matrix operations let you process a load of numbers like an assembly line: you do a group at a time rather than going through each operation one by one.
 
Soldato
Joined
24 Jun 2004
Posts
10,977
Location
Manchester
Quick and dirty explanation is that matrix operations let you process a load of numbers like an assembly line: you do a group at a time rather than going through each operation one by one.

Yes, pretty much. ... but only a very small subset of algorithms can be formulated in such a way as to make use of this.

Basically, for the applications that already make efficient use of GPUs, this will be an absolute godsend. For everyone else, it's unlikely to make any difference at all.
 
Soldato
Joined
24 Jun 2004
Posts
10,977
Location
Manchester
It'll be interesting if we see large performance gains in game engines if they start utilising FP16/FP8 workloads for suitable tasks. Could this be a sign of the hardware starting to support that?

I think so.

Mixed / half precision is quite a hot topic at the moment. As games get more complex there should be plenty of opportunity to drop various algorithms down to half precision. I imagine that having access to improved performance for FP16 / FP8 will be a lot more useful to game developers than (say) tensor cores or a big stack of FP64.
 
Associate
Joined
28 Jan 2010
Posts
1,547
Location
Brighton
I think so.

Mixed / half precision is quite a hot topic at the moment. As games get more complex there should be plenty of opportunity to drop various algorithms down to half precision. I imagine that having access to improved performance for FP16 / FP8 will be a lot more useful to game developers than (say) tensor cores or a big stack of FP64.

Could be interesting if 4K is cheap & easy to run as soon as 2018/2019 through a combo of hardware changes and mixed precision being adopted in games.
 
Soldato
Joined
24 Jun 2004
Posts
10,977
Location
Manchester
Very clear, thank you. Will this help games too?

Not really... This mostly comes into play for certain types of complex scientific simulation.

Perhaps it'll "unlock" certain algorithms for real-time implementation, allowing developers to try new things, but I can't think of anything off the top of my head. To be honest I'm not expecting the tensor cores to be present (or at least active) in the GeForce line.
 
Caporegime
Joined
18 Oct 2002
Posts
32,618
Not really... This mostly comes into play for certain types of complex scientific simulation.

Perhaps it'll "unlock" certain algorithms for real-time implementation, allowing developers to try new things, but I can't think of anything off the top of my head. To be honest I'm not expecting the tensor cores to be present (or at least active) in the GeForce line.


I doubt it in the short term, but you know what, I think there could be some amazing uses of it in the future. Deep learning is taking over so many fields; applying it to computer games could be the next big thing, and then of course tensor cores would be perfect. There are obvious things like enemy AI, but there are other things where deep learning could be used, for graphical effects or for understanding what the player is doing.
 
Soldato
Joined
24 Jun 2004
Posts
10,977
Location
Manchester
I doubt it in the short term, but you know what, I think there could be some amazing uses of it in the future. Deep learning is taking over so many fields; applying it to computer games could be the next big thing, and then of course tensor cores would be perfect. There are obvious things like enemy AI, but there are other things where deep learning could be used, for graphical effects or for understanding what the player is doing.

Hmm... "Deep learning" is generally a long, intricate process that's best suited to dealing with massive amounts of loosely-correlated data. Not really suitable for running in real-time in traditional "GPU heavy" applications like FPS or similar.

Could be very interesting for strategy games though I suppose... Here you wouldn't be constrained to doing updates at every frame, or keeping everything synchronised. The machine learning could essentially run as a background process, taking advantage of any unoccupied resources. I can imagine it being useful for something like an RTS or turn-based strategy. Could be used to adaptively develop enemy tactics based on your own moves for example.
 
Associate
Joined
28 Jan 2010
Posts
1,547
Location
Brighton
Hmm... "Deep learning" is generally a long, intricate process that's best suited to dealing with massive amounts of loosely-correlated data. Not really suitable for running in real-time in traditional "GPU heavy" applications like FPS or similar.

Could be very interesting for strategy games though I suppose... Here you wouldn't be constrained to doing updates at every frame, or keeping everything synchronised. The machine learning could essentially run as a background process, taking advantage of any unoccupied resources. I can imagine it being useful for something like an RTS or turn-based strategy. Could be used to adaptively develop enemy tactics based on your own moves for example.

You're thinking about the training part, not the utilisation part. You can have a (pre-trained) neural network identify whether it's looking at a cat, or construction worker, or bike, or house, etc. in a single picture. Just as an example.
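To put the "utilisation" (inference) part in concrete terms: a trained network's forward pass is essentially a chain of matrix products through fixed weights plus a cheap non-linearity, which is exactly the kind of work a GPU handles in real time. A toy CUDA sketch of one fully-connected layer (names and layout are illustrative):

Code:
// One fully-connected layer of a pre-trained network:
// out = ReLU(W * in + b), with W and b fixed after training.
// Inference is just repeated products like this, so it runs orders of
// magnitude faster than the training that produced the weights.
__global__ void dense_layer(const float* W, const float* b,
                            const float* in, float* out,
                            int n_out, int n_in)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < n_out) {
        float sum = b[row];
        for (int k = 0; k < n_in; ++k)
            sum += W[row * n_in + k] * in[k];
        out[row] = fmaxf(sum, 0.0f);   // ReLU activation
    }
}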


I doubt it in the short term, but you know what, I think there could be some amazing uses of it in the future. Deep learning is taking over so many fields; applying it to computer games could be the next big thing, and then of course tensor cores would be perfect. There are obvious things like enemy AI, but there are other things where deep learning could be used, for graphical effects or for understanding what the player is doing.

It could be used for enemy AI that just makes smarter choices, and/or ACTUALLY learns from you as it plays you. The possibility of this should become clear when DeepMind play at the Starcraft Tournament later this year.

Also it could be used for some interesting productivity-boosting things, like procedural generation. Procedural generation at the moment is slightly bad/boring, partly because it works in a similar way to current enemy AI (which isn't really AI) technology. If a neural network could be trained to produce environments/buildings/etc. which were on a par with human hand-placed assets, THAT would be interesting.


So... buy Nvidia shares right? haha.

Look at what's happened to their price since Feb 2016 :eek:
 
Caporegime
Joined
18 Oct 2002
Posts
32,618
Hmm... "Deep learning" is generally a long, intricate process that's best suited to dealing with massive amounts of loosely-correlated data. Not really suitable for running in real-time in traditional "GPU heavy" applications like FPS or similar.

Could be very interesting for strategy games though I suppose... Here you wouldn't be constrained to doing updates at every frame, or keeping everything synchronised. The machine learning could essentially run as a background process, taking advantage of any unoccupied resources. I can imagine it being useful for something like an RTS or turn-based strategy. Could be used to adaptively develop enemy tactics based on your own moves for example.


As AllBodies pointed out, you seem to have mixed up deep neural network training with inference. Training is incredibly computationally expensive and can take weeks on a large server farm. Inference is relatively fast, and the applications typically run in real time. Whenever you talk to your Android phone, it uses deep learning to do speech recognition. Deep learning is at the core of all autonomous vehicle technologies right now, analysing the entire environment around them from multiple 4K cameras and LIDAR sensors with millions of data points, all processed at 20-100 Hz.
 