NVIDIA Breathes Life into Kepler with the GK210 Silicon

Kaapstad · 17 Nov 2014 at 17:34

NVIDIA Breathes Life into Kepler with the GK210 Silicon

NVIDIA's "Maxwell" architecture may have got a rather low-key debut with the GeForce GTX 750 Ti, but nobody saw its performance-segment derivative coming: the GM204 silicon, driving the GeForce GTX 980 and the GTX 970. The new architecture makes its predecessor, "Kepler," look inefficient in comparison. Even so, it looks like NVIDIA still thinks Kepler is competitive against AMD (GCN) and Intel (Knights Corner) in high-performance computing.

The problems here are that NVIDIA has already launched a GK110-based Tesla HPC card, and that its big "Maxwell" chip is nowhere in sight. The GM204 has limited memory bandwidth, and its texture-compression mojo can't bail out bandwidth-hogging HPC applications. The solution? Develop a new big silicon based on "Kepler." Enter the GK210. That's right, the G-K-210. Launched today with the Tesla K80 dual-chip HPC accelerator, this chip could feature design improvements over the GK110, while offering memory bandwidth and sizes not possible on the GM204.

The Tesla K80 accelerator is a dual-chip solution, with two GK210 chips. Each of the two features 2,496 CUDA cores, totaling 4,992 in all. Each chip features a 384-bit wide GDDR5 memory interface, wired to 12 GB of memory. That gives the K80 a staggering 24 GB of memory, across two 240 GB/s memory interfaces. 240 GB/s may not seem like a figure a GM204 can't achieve, but we're beyond consumer (GeForce) and enterprise (Quadro) market-segments here, entering the mission-critical (Tesla) one. NVIDIA is clocking the card very conservatively. The Tesla isn't a graphics card to begin with. Its core runs at 562 MHz, which can spool up to 875 MHz, and the memory ticks at 5.00 GHz, less than the 6 GHz on the Tesla K40.
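As a rough sanity check on those figures, peak GDDR5 bandwidth is just the bus width in bytes multiplied by the effective data rate. The sketch below assumes the quoted 5.00 GHz is the effective per-pin data rate; the function name is made up for illustration.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, effective_rate_ghz: float) -> float:
    """Peak memory bandwidth in GB/s: bytes moved per transfer times transfers per second."""
    return bus_width_bits / 8 * effective_rate_ghz


# One GK210: 384-bit bus, 5.00 GHz effective memory clock (assumed to be the data rate).
per_gpu_gb_s = peak_bandwidth_gb_s(384, 5.0)

# Two GPUs on the board, each with its own memory interface and 12 GB of memory.
board_gb_s = 2 * per_gpu_gb_s
board_memory_gb = 2 * 12
board_cuda_cores = 2 * 2496

print(per_gpu_gb_s, board_gb_s, board_memory_gb, board_cuda_cores)
```

The board-level totals are only sums across two separate GPUs; each GPU still sees just its own 12 GB and its own memory interface.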

So what's changed between the GK210 and the GK110? For one, it appears to be extremely energy efficient. The Tesla K80 comes with passive cooling (it relies on the air-flow of the rackmount blade it's part of), and has a TDP rating of 300W (150W per GPU system). In comparison, the single-chip Tesla K40 is rated at 235W. The Boost clocks of both chips are identical, even if the nominal clocks on the Tesla K80's GK210 are marginally lower, and the memory clocks are roughly 17% lower (5 GHz versus 6 GHz). Another technical difference between the GK210 and the GK110 is under the hood.
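To put the efficiency claim in numbers, here is a back-of-the-envelope comparison using only the TDP and memory clock figures quoted above. Splitting the board TDP evenly between the two GPUs is an assumption that ignores shared board components, so treat the result as indicative only.

```python
k80_board_tdp_w = 300   # dual-GPU Tesla K80
k40_board_tdp_w = 235   # single-GPU Tesla K40
k80_mem_ghz = 5.0
k40_mem_ghz = 6.0

# Assumption: board power is shared equally by the two GK210 chips.
k80_tdp_per_gpu_w = k80_board_tdp_w / 2

# Fractional reduction relative to the K40.
power_reduction = 1 - k80_tdp_per_gpu_w / k40_board_tdp_w
mem_clock_reduction = 1 - k80_mem_ghz / k40_mem_ghz

print(f"{k80_tdp_per_gpu_w:.0f} W per GPU, {power_reduction:.0%} below the K40's TDP")
print(f"memory clock {mem_clock_reduction:.1%} lower than the K40")
```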

While both chips are based on the "Kepler" architecture, the GK210 features double the shader cache. Each of the 15 streaming multiprocessors (SMXs) features 128 KB of shader cache, compared to 64 KB per SMX on the GK110. The GK210 also has a 512 KB register file per SMX, double the 256 KB register file of the GK110. A larger register file means that the number of variables a shader can use is increased. If an operation runs out of registers, then those variables have to sit in the chip's limited last-level cache, taking more clock cycles to fetch, or even worse, in the GPU memory, which is several orders of magnitude slower. These two changes could step up the GPU's serial processing performance slightly, while retaining its inherent parallel processing advantages, which could really help in an HPC environment. In other words, we won't hold our breath for a consumer GeForce debut of this chip.
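To see what the bigger register file could mean per thread, the sketch below divides it across a fully occupied SMX. It assumes 32-bit (4-byte) registers and a limit of 2,048 resident threads per SMX, neither of which is stated in the article; the helper name is hypothetical.

```python
REGISTER_BYTES = 4            # assumption: each register is 32 bits wide
MAX_THREADS_PER_SMX = 2048    # assumption: resident-thread limit of a Kepler SMX


def registers_per_thread(register_file_kb: int,
                         resident_threads: int = MAX_THREADS_PER_SMX) -> int:
    """Registers available to each thread when the SMX is fully occupied."""
    total_registers = register_file_kb * 1024 // REGISTER_BYTES
    return total_registers // resident_threads


gk110_regs = registers_per_thread(256)   # 256 KB register file per SMX
gk210_regs = registers_per_thread(512)   # 512 KB register file per SMX

print(gk110_regs, gk210_regs)
```

Under those assumptions, a kernel running at full occupancy gets twice as many registers per thread on the GK210 before variables start spilling out of the register file.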

http://www.techpowerup.com/207265/nvidia-breathes-life-into-kepler-with-the-gk210-silicon.html

I don't think we will see big Maxwell for a long time.

Kaapstad · 17 Nov 2014 at 17:36

NVIDIA Unveils Tesla K80 Dual-Chip Compute Accelerator

NVIDIA today unveiled a new addition to the NVIDIA Tesla Accelerated Computing Platform: the Tesla K80 dual-GPU accelerator, the world's highest performance accelerator designed for a wide range of machine learning, data analytics, scientific, and high performance computing (HPC) applications.

The Tesla K80 dual-GPU is the new flagship offering of the Tesla Accelerated Computing Platform, the leading platform for accelerating data analytics and scientific computing. It combines the world's fastest GPU accelerators, the widely used CUDA parallel computing model, and a comprehensive ecosystem of software developers, software vendors, and datacenter system OEMs.

The Tesla K80 dual-GPU accelerator delivers nearly two times higher performance and double the memory bandwidth of its predecessor, the Tesla K40 GPU accelerator. With ten times higher performance than today's fastest CPU, it outperforms CPUs and competing accelerators on hundreds of complex analytics and large, computationally intensive scientific computing applications.

Users can unlock the untapped performance of a broad range of applications with the accelerator's enhanced version of NVIDIA GPU Boost technology (PDF), which dynamically converts power headroom into the optimal performance boost for each individual application.

Industry-Leading Performance for Science, Data Analytics, Machine Learning
The Tesla K80 dual-GPU accelerator was designed with the most difficult computational challenges in mind, ranging from astrophysics, genomics and quantum chemistry to data analytics. It is also optimized for advanced deep learning tasks, one of the fastest growing segments of the machine learning field.

"NVIDIA GPUs have become the de facto computing platform for the deep learning community," said Yann LeCun, director of AI Research at Facebook, and Silver Professor of Computer Science & Neural Science at New York University. "Because the accuracy of deep learning systems improves as the models and datasets get larger, we always look for the fastest hardware we can find. The Tesla K80 accelerator, with its dual-GPU architecture and large memory, gives us more teraflops and more GB than ever before from a single server, allowing us to make faster progress in deep learning."

The Tesla K80 delivers up to 8.74 teraflops single-precision and up to 2.91 teraflops double-precision peak floating point performance, and 10 times higher performance than today's fastest CPUs on leading science and engineering applications, such as AMBER, GROMACS, Quantum Espresso and LSMS.
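Those peak numbers can be roughly reproduced from the core count and the 875 MHz boost clock quoted in the first article. The sketch assumes each CUDA core retires one fused multiply-add (2 floating point operations) per clock and that double precision runs at one third of the single-precision rate; both are assumptions, not figures from the press release.

```python
def peak_tflops(cuda_cores: int, clock_mhz: float,
                flops_per_core_per_clock: int = 2) -> float:
    """Theoretical peak in teraflops; 2 FLOPs per clock assumes one FMA per core."""
    return cuda_cores * flops_per_core_per_clock * clock_mhz * 1e6 / 1e12


board_cores = 2 * 2496                       # both GK210 chips on the K80

sp_boost = peak_tflops(board_cores, 875)     # at the quoted boost clock
sp_base = peak_tflops(board_cores, 562)      # at the quoted base clock
dp_boost = sp_boost / 3                      # assumption: 1/3 double-precision rate

print(f"SP boost {sp_boost:.2f}, SP base {sp_base:.2f}, DP boost {dp_boost:.2f} TFLOPS")
```

On these assumptions, the "up to" figures correspond to both GPUs holding the full boost clock; at the base clock the single-precision peak is considerably lower.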

"The Tesla K80 dual-GPU accelerators are up to 10 times faster than CPUs when enabling scientific breakthroughs in some of our key applications, and provide a low energy footprint," said Wolfgang Nagel, director of the Center for Information Services and HPC at Technische Universität Dresden in Germany. "Our researchers use the available GPU resources on the Taurus supercomputer extensively to enable a more refined cancer therapy, understand cells by watching them live, and study asteroids as part of ESA's Rosetta mission."

Key features of the Tesla K80 dual-GPU accelerator include:
•Two GPUs per board - Doubles throughput of applications designed to take advantage of multiple GPUs.
•24GB of ultra-fast GDDR5 memory - 12GB of memory per GPU, 2x more memory than Tesla K40 GPU, allows users to process 2x larger datasets.
•480GB/s memory bandwidth - Increased data throughput allows data scientists to crunch through petabytes of information in half the time compared to the Tesla K10 accelerator. Optimized for energy exploration, video and image processing, and data analytics applications.
•4,992 CUDA parallel processing cores - Accelerates applications by up to 10x compared to using a CPU alone.
•Dynamic NVIDIA GPU Boost Technology - Dynamically scales GPU clocks based on the characteristics of individual applications for maximum performance.
•Dynamic Parallelism - Enables GPU threads to dynamically spawn new threads, enabling users to quickly and easily crunch through adaptive and dynamic data structures.

The Tesla K80 accelerates the broadest range of scientific, engineering, commercial and enterprise HPC and data center applications -- more than 280 in all. The complete catalog of GPU-accelerated applications (PDF) is available as a free download.

More information about the Tesla K80 dual-GPU accelerator is available at NVIDIA booth 1727 at SC14, Nov. 17-20, and on the NVIDIA high performance computing website.

Users can also try the Tesla K80 dual-GPU accelerator for free on remotely hosted clusters. Visit the GPU Test Drive website for more information.

Availability
Shipping today, the NVIDIA Tesla K80 dual-GPU accelerator will be available from a variety of server manufacturers.

http://www.techpowerup.com/207260/nvidia-unveils-tesla-k80-dual-chip-compute-accelerator.html

Dicehunter · 17 Nov 2014 at 18:19

Kaapstad said:
I don't think we will see big Maxwell for a long time.

Shame, I was hoping to see another of your epic rigs

bru · 17 Nov 2014 at 18:52

So do you think there will be a step down from the TitanZ as in the TitanY....bother

kazuya1337 · 17 Nov 2014 at 19:00

You will see a big Maxwell as soon as the 390X is released, why would they do anything else.

Disco_P · 17 Nov 2014 at 20:45

I wouldn't mind a lower power consumption version of kepler.

Ayahuasca · 17 Nov 2014 at 20:57

Stunning

Orangey · 17 Nov 2014 at 21:15

How long do you think they have to wait before putting out a GM200 Tesla card after this then? Would they be stockpiling dies right about now?

Kaapstad · 17 Nov 2014 at 21:30

Orangey said:
How long do you think they have to wait before putting out a GM200 Tesla card after this then? Would they be stockpiling dies right about now?

I think the above means we are at least 9 months away from anything with Big Max, as NVidia are not going to launch two different products for the same market back to back. They are going to want to recoup launch/development costs and make some money first.

This also means that Big Max Titans are likely to be at least a year away too as NVidia will launch the pro versions of the cards first.

The thing that could change this is if AMD launch something very fast in the next 6 months, now where is AMDMatt when we need him to get his buddies on the case.

andybird123 · 17 Nov 2014 at 21:49

Kaapstad said:
I think the above means we are at least 9 months away from anything with Big Max, as NVidia are not going to launch two different products for the same market back to back. They are going to want to recoup launch/development costs and make some money first.

This also means that Big Max Titans are likely to be at least a year away too as NVidia will launch the pro versions of the cards first.

The thing that could change this is if AMD launch something very fast in the next 6 months, now where is AMDMatt when we need him to get his buddies on the case.

unless GM200 is totally dedicated to gaming and its DP compute is still 1/8 or 1/16 or whatever it is with maxwell, then this dual GPU Kepler card makes perfect sense for a professional card that maxwell on 28nm can't compete with

Orangey said:
How long do you think they have to wait before putting out a GM200 Tesla card after this then? Would they be stockpiling dies right about now?

my best guess would be that we aren't going to see a GM200 tesla card, it will be GM210 or 220 or whatever they call it when they go to 16nm and have the die space to dedicate to DP performance, in which case yes, 9 months away

systemerror · 17 Nov 2014 at 21:50

Well, this means all the 980 adopters should see their money's worth at least.

pandem0nium · 17 Nov 2014 at 22:44

"Rise from your grave!"

CAT-THE-FIFTH · 18 Nov 2014 at 09:20

The main reason large chip NV cards exist is for pro markets where there are large margins and the gamers probably get the cast offs.

Edit!!

I have a feeling I know why Nvidia stuck with a modified GK110. Under constant compute loads the GM204 does not show massive improvements in performance/watt over the previous generation - it's under constantly changing loads like gaming that you see the major improvements with the current Maxwell GPUs.

subbytna · 18 Nov 2014 at 10:42

bru said:
So do you think there will be a step down from the TitanZ as in the TitanY....bother

Genuine lol at that

Locky · 18 Nov 2014 at 10:54

These 970's have been one of my better purchases in recent times. The 7970's they replaced were also very good servants. So long live my 970's for a while

MasterOC · 18 Nov 2014 at 12:29

kazuya1337 said:
You will see a big Maxwell as soon as the 390X is released, why would they do anything else.

Pricing difference ... Don't think even Nvidia would bring out another Titan to compete with the 390x ... Different markets almost .

It'll be the cut down big boy first like the 780
