Nvidia GTC16 Webcast Live Stream @ 5pm/9am PDT *Jen looks great in leather.*

Soldato
Joined
17 Jun 2004
Posts
7,617
Location
Eastbourne , East Sussex.
D.P - SK Hynix say the best performance is from the 8-Hi stack @ 256GB/s (giving 1TB/s for a full 32GB card), which is where the numbers in the presentation come from.

4-Hi is 204.8GB/s bandwidth per chip, which is the inverse of what you said (and also the way SSDs work - larger capacities being faster than smaller ones).
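
A quick back-of-the-envelope check of those figures (a rough sketch; the per-stack numbers are the ones quoted above, not official specs):

```python
# Sanity check of the quoted HBM2 stack figures (assumed from the post above).
gbps_per_8hi_stack = 256   # quoted bandwidth per 8-Hi stack, GB/s
gb_per_8hi_stack = 8       # 8 GB per 8-Hi stack -> 32 GB with four stacks
stacks = 4                 # typical four-stack layout on the interposer

print(stacks * gbps_per_8hi_stack, "GB/s total")   # 1024 GB/s, i.e. ~1 TB/s
print(stacks * gb_per_8hi_stack, "GB total")       # 32 GB
```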

Regarding the 980Ti vs Fury X - now that DX12 games are here, GCN is showing what it can do, which is why Nvidia are going the same way. The 980Ti is a DX11 card which sadly can't do the big selling point of DX12: async compute.

@Rrof

I concede that HBM2 is larger than HBM1, but it's not double.

hbm2_mechanical.png
 
Man of Honour
Joined
21 May 2012
Posts
31,922
Location
Dalek flagship
I concede that HBM2 is larger than HBM1, but it's not double.

hbm2_mechanical.png

If you do the maths on the specs you have just posted, HBM2 is about double the area, and even more if you take height into account.
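
For what it's worth, a rough footprint comparison using the package dimensions usually quoted for the two generations (the dimensions are my assumption from the SK Hynix mechanical drawings, so treat them as approximate):

```python
# Approximate package footprints in mm (assumed figures, not taken from this thread).
hbm1_area = 5.48 * 7.29    # HBM1 package, ~40 mm^2
hbm2_area = 7.75 * 11.87   # HBM2 package, ~92 mm^2

print(f"HBM1 ~{hbm1_area:.0f} mm^2, HBM2 ~{hbm2_area:.0f} mm^2, "
      f"ratio ~{hbm2_area / hbm1_area:.1f}x")
```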
 
Caporegime
Joined
18 Oct 2002
Posts
33,188
The theoretical FP32 performance has little to do with real-world gaming performance. The 980Ti has 5.6TFlops of FP32 performance compared to Fiji's 8.6, yet is faster. The same goes for things like memory bandwidth: AMD has always had wider buses and more bandwidth, but Nvidia has employed smarter compression techniques etc. that mitigate the difference. You also have to consider that in compute, FP32 is not actually critically important; very strong FP64 and FP16 performance is more desirable. Either you need the precision or you don't.


And then there is the fact that GP100 is squarely aimed at HPC, where there are big bucks to be earned. I have a feeling that Nvidia will start to diverge their product line; the GP104 might not have anywhere near the transistor budget spent on FP64 performance.

Comparing different architectures is one thing, comparing more similar architectures is another.

The 580GTX to Titan was triple the FP32 performance, and that is what is responsible for the gaming performance. From Titan X to GP100 (which is a 300W card, so unlikely to have higher consumer speeds) you're going from 6.1 to 10.5TF, less than double. That is the important part to realise, rather than the comparison to Fury.

http://www.anandtech.com/show/6774/nvidias-geforce-gtx-titan-part-2-titans-performance-unveiled/11

Just a quick look at the first result really - I randomly picked a game, so others may show different results. 31 fps for the 580GTX, 61 for the Titan. So tripling the FP32 performance brought around double the gaming performance; what do we think less than doubling the FP32 performance from Titan X to GP100 will bring? I honestly don't know. However, Nvidia like to bang the drum that they have a highly efficient, high-utilisation architecture against AMD's high-theoretical/lower-utilisation performance, which I'm fine with; if both are a similar sized core, neither is really right or wrong. But if the current architecture already has good efficiency/utilisation, then the primary gain in performance has to come from increasing the total theoretical performance rather than the efficiency, and that is where GP100 appears to fall significantly short.
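
To make the comparison concrete, here is the ratio arithmetic being described (the 580/Titan TFLOPS figures are approximate numbers I've filled in; the rest are the figures quoted above and in the linked result):

```python
# Scaling ratios discussed above; GTX 580 / Titan TFLOPS are approximate fill-ins.
gtx580_tf, titan_tf = 1.58, 4.5      # GTX 580 -> original Titan FP32 (approx.)
titanx_tf, gp100_tf = 6.1, 10.5      # Titan X -> GP100 FP32 (as quoted above)
gtx580_fps, titan_fps = 31, 61       # fps from the linked AnandTech result

print(f"580 -> Titan: {titan_tf / gtx580_tf:.1f}x FP32, "
      f"{titan_fps / gtx580_fps:.1f}x fps")
print(f"Titan X -> GP100: {gp100_tf / titanx_tf:.1f}x FP32")
```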


It's also very disingenuous to say you either need the precision or you don't, implying that either FP16 or FP64 is better and that FP32 doesn't mean much. FP32 is another precision level; your application requires whatever precision it needs, and to imply you either need FP16 or FP64 is frankly silly - FP32 is just as important. Everything is a balance of performance vs accuracy. From gaming to weather models, a model so accurate that it is too slow to provide meaningful results is worthless, and one that is stupidly quick but so inaccurate that it provides meaningless results is also worthless. There is no binary "you need this or that and anything else sucks".
 
Soldato
Joined
22 Nov 2009
Posts
13,252
Location
Under the hot sun.
£103,000, haha - I could buy two Nissan GTRs for the price of one card.

This is not meant for consumers.

To put it into perspective, for that money I could pay off the remainder of my 6-bedroom house mortgage, or fund 15 years' retirement in Greece, or buy an F-Type plus enough change for a used AMG 63 or SL55. And still have pocket money... for a flat.
 
Caporegime
Joined
20 May 2007
Posts
39,928
Location
Surrey
To put it into perspective, for that money I could pay off the remainder of my 6-bedroom house mortgage, or fund 15 years' retirement in Greece, or buy an F-Type plus enough change for a used AMG 63 or SL55. And still have pocket money... for a flat.

Ah, I do love the internet.
 


Soldato
Joined
11 Dec 2004
Posts
4,688
Odd how people are going on about the $100k P100 price when they showed that getting the same performance from the previous setup would cost $500k in networking and such before you even add the actual AI hardware. I'd call the P100 a bargain in that case.
 
Associate
Joined
28 Jan 2010
Posts
1,547
Location
Brighton
And that's the last we'll ever hear of it.

Indeed, it seems like an OTT piece of tech. Just because they can.

I know the whole "can humans see more than 60 Hz? 144 Hz?" argument has been done to death.

But honestly, we clearly can't perceive increments down to 0.59 ms, so I don't see what the point of this is.
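
For reference, the frame-time arithmetic behind those numbers (just a quick sketch; only the 0.59 ms figure comes from the discussion above):

```python
# Frame times for common refresh rates, and the rate a 0.59 ms frame time implies.
for hz in (60, 144, 240):
    print(f"{hz} Hz -> {1000 / hz:.2f} ms per frame")

print(f"0.59 ms per frame -> ~{1000 / 0.59:.0f} Hz")   # roughly 1695 Hz
```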
 
Soldato
Joined
18 Feb 2015
Posts
6,492
So what do you guys suggest? They take the time off and just do what? It's called pushing the envelope; some people live to advance technology, not to stagnate.
 
Caporegime
Joined
18 Oct 2002
Posts
33,188
Odd how people are going on about the $100k P100 price when they showed that getting the same performance from the previous setup would cost $500k in networking and such before you even add the actual AI hardware. I'd call the P100 a bargain in that case.

You think you couldn't stick 8 GPUs in a box before the P100 box shown, without $500k of networking equipment... lol.

The level of DP/SP performance available in that $100k box is available today for dramatically less than $100k via a couple of boxes and twice as many GPUs.

The price is basically there to put people off by making it a non-viable alternative to current tech - the same way Nvidia didn't want the Titan Z to sell: it wasn't a product they wanted, so they priced it where no one would buy it. They didn't want high-power, hot-running, noisy cards, but didn't want to be seen as not competing with AMD, so they released a card at a price that even Nvidia enthusiasts wouldn't buy.

They don't have real volume until Q1 next year, so they say they have a box available that is so much more expensive that no one gains financially by buying one today. Watch how similar boxes become financially sensible options from next year, when volume is higher, costs are lower and they actually want to sell them.
 
Soldato
Joined
19 May 2004
Posts
3,868
GPUs with high-bandwidth interconnects like that are harder to come by. Of course it's cheaper, but guys, these pieces of hardware are for companies; if they need to spend £100,000 to make £50,000,000, they don't care.
 
Caporegime
Joined
18 Oct 2002
Posts
33,188
Except NVLink isn't hugely high bandwidth as an interconnect, and it also looks like 99% of the stuff Nvidia has been peddling about NVLink for the past 2 years has been NVLink 2; the current link turns out to be NVLink 1.0, which has no connection between CPU and GPU. Considering that two of the biggest uses of NVLink were increasing the GPU's connection speed to system memory and virtual unified memory, lacking NVLink between GPU and CPU basically breaks both, making the first implementation nearly useless.

Basically, the first version of NVLink uses normal PCIe from the CPU to a PCIe switch, normal PCIe from the switch to each GPU, and NVLink between GPUs, so the biggest limitation is still there and not at all removed yet. Currently you can already build systems with as many PCIe switches as you want to increase GPU-to-GPU speed: 16x PCIe from the CPU to a switch, the switch provides another 16 lanes, and then two GPUs can each get a 16x connection despite only 16x total from the CPU. You can add as many switches and as much GPU-to-switch bandwidth as you want for fairly little cost.

So today you can provide what NVLink does for a fraction of the cost using standard PCIe switches. The main touted benefit of NVLink isn't there yet. As such, NVLink and Pascal offer very little different from existing systems and no extra CPU-GPU bandwidth, so there is limited relief for the scaling problem it was designed for. The main gain then is performance per GPU and density of the box. It doesn't come close to providing a cost-vs-performance benefit over anything Nvidia or AMD can provide in a box today already. One system costing $10k with $100k of GPUs that will match the performance of two $10k systems with $10-20k of current-gen GPUs... so $110k+ vs $40-60k for the same performance? There is no benefit at all so far at those prices.
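
For rough context on the link speeds being discussed, a small comparison sketch (the PCIe 3.0 and NVLink 1.0 numbers are my assumptions from the commonly quoted specs, not figures from this thread):

```python
# Approximate per-direction bandwidths (assumed figures, not from this thread).
pcie3_x16 = 15.75        # PCIe 3.0 x16, GB/s each way (approx.)
nvlink1_link = 20        # NVLink 1.0 per link, GB/s each way (approx.)
p100_links = 4           # links per P100; in practice they are split across peer GPUs

print(f"PCIe 3.0 x16: ~{pcie3_x16} GB/s each way")
print(f"NVLink 1.0: ~{nvlink1_link} GB/s per link, "
      f"~{p100_links * nvlink1_link} GB/s if all {p100_links} links went to one peer")
```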
 
Associate
Joined
26 May 2012
Posts
1,583
Location
Surrey, UK
To put it into perspective, for that money I could pay off the remainder of my 6-bedroom house mortgage

You either have very little left in your mortgage, or you live in America (mansions for pennies over there), or you're well off enough to be able to afford a house these days lol. If they were selling GPUs for current house prices... I don't even think research institutes would buy them, even they have budgets!

No Pascal for consumers at GTC got me thinking about why that might be... maybe Pascal isn't as great as we thought, and Nvidia made a last-minute cut of the consumer Pascal segment after seeing AMD's Polaris 10 recently. Perhaps they've gone back to the drawing board for some reason? Conspiracy-level speculation, but with no solid consumer Pascal news that's mostly what this thread is: speculation and rumours.
 