
Nvidia Pascal Architecture Detailed Technical Analysis – Stacked DRAM and NV Link

[Editorial] Before I begin, a humble warning that this post might get a little technical. This generation of graphics cards is not about brute power, but about efficiency and intelligent design: achieving maximum throughput while maintaining a very small footprint. Basically, true progress, and it's not just about adding more transistors to a die. Nvidia demoed two critical technologies at GTC this year, namely NV Link and stacked DRAM, aka '3D memory'. They understandably didn't give a lot of technical detail, since the demo was aimed at a general audience, but I will try to take care of that today, albeit slightly late.
NVIDIA Pascal GPU Chip Module



Nvidia Pascal: Using CoW (Chip-on-Wafer) based 3D Memory to Achieve the Next Gen GPU

Let's begin with 3D memory. Most of you know what SoC (System-on-Chip) means, but now we have a slightly less common term which I will take the opportunity to explain. CoW (cue mundane bovine jokes), or Chip-on-Wafer, is a design technique in which a single logic die sits directly under (or over) a stack of DRAM dies. The dies are stacked and the silicon is punched through with vertical pillars called TSVs (Through-Silicon Vias) that run down to the control die. In this case it means the stacked DRAM dies are all controlled by a single logic die, hence the 'Chip-on-Wafer' name. In all probability Nvidia's 3D RAM will use the JEDEC HBM standard, which funnily enough was developed by AMD together with SK Hynix and standardized by JEDEC, and the actual production will most likely be carried out by SK Hynix. Pascal's stacked DRAM design is most probably going to come in two configurations (since they mentioned the 1 TB/s mark); a quick sketch of the arithmetic follows the two configurations below:
Configuration 1: 2x stack (512 GB/s) + 1 control die. This is called 2-Hi HBM.
Configuration 2: 4x stack (1024 GB/s) + 1 control die. This is called 4-Hi HBM.
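Taking the figures in the two configurations above at face value, the scaling works out as below. This is only a sketch of the post's own numbers: the 256 GB/s per-layer contribution is simply what those 2-Hi/4-Hi figures imply, not anything Nvidia has confirmed.

```python
# Rough sketch of how the quoted stack bandwidths scale with layer count.
# The 256 GB/s per-layer figure is only what the 2-Hi/4-Hi numbers above
# imply; Nvidia has not published a per-layer bandwidth.

PER_LAYER_GBPS = 256  # GB/s contributed by each DRAM layer (assumed)

def stack_bandwidth(layers: int) -> int:
    """Aggregate bandwidth of an n-Hi stack (the control die adds no bandwidth)."""
    return layers * PER_LAYER_GBPS

for layers in (2, 4):
    print(f"{layers}-Hi HBM: ~{stack_bandwidth(layers)} GB/s")
# 2-Hi HBM: ~512 GB/s
# 4-Hi HBM: ~1024 GB/s  (~1 TB/s, matching the GTC claim)
```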
Nvidia might even introduce an intermediate configuration (a 3-high stack) between these two 'traditional' configs in the Pascal architecture, but that is unlikely. So here we have an interesting question. Green has promised us speeds of up to 1 terabyte per second, so there is more or less no question that the high-end GPUs will ship with 4 + 1 layers of stack, but what about the middle and lower orders? Will they also ship with the same number of layers, or a lesser configuration? If I were to make an educated speculation, I would put my money on multiple configurations scaled across the spectrum of GPUs: the middle order with 2 + 1 layers, and the top order with 4 + 1 layers. Continuing the same speculation, HBM uses a low operating frequency and has a low power requirement, so Nvidia's stacked DRAM will most probably operate at around 1.2 V with a frequency of around 1 GHz. Here is a comparison chart between our traditional GDDR5 RAM and x2 and x4 stacks of DRAM with their control dies.

[Comparison chart: GDDR5 vs 2-Hi and 4-Hi stacked DRAM with control dies]
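To get a feel for why the chart favours stacked DRAM despite its ~1 GHz, ~1.2 V operation, here is a back-of-the-envelope look at how wide a memory interface has to be to hit a bandwidth target at a given per-pin rate. The 7 Gb/s figure is typical of current GDDR5; the ~1 Gb/s figure is just the low-clock assumption from the paragraph above.

```python
# bus width (bits) needed = target bandwidth (GB/s) * 8 / per-pin rate (Gb/s)
# Narrow-and-fast (GDDR5) vs very-wide-and-slow (stacked DRAM over TSVs).

def bus_width_needed(target_gb_per_s: float, pin_rate_gb_per_s: float) -> float:
    """Interface width in bits required to reach a bandwidth target."""
    return target_gb_per_s * 8 / pin_rate_gb_per_s

TARGET = 1000  # GB/s, i.e. the ~1 TB/s mark quoted at GTC

print(round(bus_width_needed(TARGET, 7.0)))  # ~1143 bits at a GDDR5-style 7 Gb/s per pin
print(round(bus_width_needed(TARGET, 1.0)))  # 8000 bits at a slow ~1 Gb/s per pin
# TSVs are what make an interface that wide practical, which is why the
# stacks can run at low clocks and low voltage and still win on bandwidth.
```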


You might have noticed that the 3D memory has a dual-command input feature. The reason for this is that a single layer of 3D memory holds two RAM modules, i.e. an 8 GB 4-Hi HBM stack is divided into 4 layers, each layer with a 1 + 1 GB configuration. This is what enables the dual-command feature. Of course, if we increase every layer to 4 GB, then a 4-Hi HBM stack can reach a 16 GB configuration, and it scales down the same way if we want to make it smaller. However, in this generation don't expect anything above the 8 GB 4-Hi HBM configuration. We could in principle scale to 8-Hi HBM with a maximum capacity of 32 GB, but that kind of memory in desktop GPUs is unlikely.
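To make the capacity arithmetic explicit, here is a minimal sketch; the per-layer sizes are the ones used in the paragraph above, not confirmed figures.

```python
# Capacity of an n-Hi stack = layers * capacity per layer.
# Each layer holds two DRAM modules (the source of the dual-command feature),
# so a 2 GB layer is a 1 + 1 GB split and a 4 GB layer is 2 + 2 GB.

def stack_capacity_gb(layers: int, gb_per_layer: int) -> int:
    """Total capacity of a stack; the control die holds no DRAM."""
    return layers * gb_per_layer

print(stack_capacity_gb(4, 2))  # 8  -> the 8 GB 4-Hi config discussed above
print(stack_capacity_gb(4, 4))  # 16 -> 4-Hi with 4 GB layers
print(stack_capacity_gb(8, 4))  # 32 -> hypothetical 8-Hi maximum
```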

Nvidia Pascal: NV Link – A Very High Speed Interconnect
NVIDIA NVLINK GPU Scalability



I would be very surprised if Nvidia's '3D memory' is not used in AMD's next gen too, considering that they are the ones who actually came up with HBM, not to mention the standard is open. However, it is a slightly different story with NV Link, which appears to be more or less proprietary. NV Link is going to come in 3 different layouts in the upcoming Pascal architecture. The first one is irrelevant to us, but I am going to touch upon it slightly anyway:
1. NV Link designed for the IBM Power CPUs
2. NV Link designed for the GPU – CPU connection via your normal PCI Express slot
3. NV Link designed for the Onboard ARM – GPU Connection of future Nvidia GPUs.
Since IBM does not have an HPC/server GPU solution of its own, it has decided to pursue a very promising partnership with Nvidia. Pascal's NV Link would see Nvidia getting out of its x86 comfort zone and making its GPUs interface with IBM's Power CPUs. NV Link itself is basically a super-high-speed serial interconnect that uses an embedded-clock differential signaling technique, and this allows it to achieve roughly 5 to 10 times the speed of PCI Express 3.0 running in x16 mode. The actual speeds are thought to be along the lines of 80 GB/s to 230 GB/s.
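For reference, the 5-10x claim lines up roughly as below. The PCI-E 3.0 figure comes from the spec (8 GT/s per lane with 128b/130b encoding); the NV Link range is just the rumoured figure quoted above.

```python
# PCI Express 3.0: 8 GT/s per lane, 128b/130b encoding -> ~0.985 GB/s per lane.
PCIE3_LANE_GBPS = 8 * 128 / 130 / 8   # GB/s per lane, per direction

pcie3_x16 = 16 * PCIE3_LANE_GBPS      # ~15.75 GB/s
print(f"PCI-E 3.0 x16: ~{pcie3_x16:.1f} GB/s")

# The 5x - 10x claim therefore lands around:
print(f"5x  -> ~{5 * pcie3_x16:.0f} GB/s")   # ~79 GB/s
print(f"10x -> ~{10 * pcie3_x16:.0f} GB/s")  # ~158 GB/s
# The rumoured 80-230 GB/s range sits above that, so the upper figure
# presumably assumes more NV Link lanes than a single x16-equivalent link.
```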
What Nvidia is going for is basically a complete point-to-point design where the processors are connected directly to each other without going through a third-party channel. However, this means that the current PCI-E slot is no good, so NV Link will have to be physically included in motherboards of the future. Rumors put NV Link as a glorified mezzanine connector, which would allow, bluntly put, a socketed GPU. Since Pascal already has on-package memory, Nvidia's custom bus 'NV Link' with the help of a mezzanine interface will allow never-before-seen speeds in GPU data transfer. Not only that, but a custom mezzanine connector would be able to supply far more than the 75 W available from our PCI-E slots today, allowing GPUs to be completely powered by the NV Link. However, AnandTech has raised some very valid criticism: NV Link is in no position to replace PCI-E any time soon, with the best-case scenario being GPUs with dual NV Link / PCI-E connectivity and the server market (IBM) taking hold of NV Link completely. I won't go into much more detail on how the NV Link functions via blocks, since that has already been covered multiple times. So umm, yeah, that's all folks.
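On the power point: a PCI-E slot is specced for 75 W, so anything beyond that currently comes from auxiliary 6-pin (75 W) and 8-pin (150 W) cables. A quick sketch of what a connector that powers the card entirely would have to replace; the 250 W board power is just a typical high-end figure, not a Pascal spec.

```python
# Today's power budget for a high-end card (connector limits from the
# PCI-E CEM spec; the 250 W board power is a typical value, not a Pascal number).
SLOT_W = 75   # PCI-E x16 slot
PIN6_W = 75   # 6-pin auxiliary connector
PIN8_W = 150  # 8-pin auxiliary connector

board_power = 250
aux_needed = board_power - SLOT_W
print(f"Auxiliary power needed beyond the slot: {aux_needed} W")          # 175 W
print(f"e.g. one 8-pin + one 6-pin = {PIN8_W + PIN6_W} W of headroom")    # 225 W

# A mezzanine-style NV Link connector that supplied the full 250 W would
# make those cables redundant, which is the claim being made above.
```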


Read more: http://wccftech.com/nvidia-pascal-a...-analysis-stacked-dram-nv-link/#ixzz2ycEpG1W7
 
Yer Cat, CD keeps whinging on forums and blogs that nVidia is making a big thing of stacked DRAM when it will be used on AMD as well.
 
Charlie is a whiney little kid and nVidia said "no" to him at some stage, so he has made it a personal mission to never give credit where it is due and big up AMD at every opportunity.

Good job you are not like that DM ;)
 
Good job you don't ever read semiaccurate and continue to spout nonsense.

I can't afford to pay the $1000 a year subscription.

Looking at SA, the first article is about Kabini.

http://semiaccurate.com/2014/04/09/amd-launches-first-system-socket-soc-kabini/

Congratulations to AMD for turning an otherwise boring product into a compelling value play in the entry-level and small formfactor PC market. S|A

http://semiaccurate.com/2014/04/08/amd-launces-r9-295x2-faster-full-speed-dual-hawaii-card/


Since the Titan Z didn’t have a specified clock or ship date, anyone think Nvidia got wind of the R9 295X2 and decided to preempt the announcement? It looks like they didn’t get the specs or pricing of the 295 because they were way way off the price and performance mark.

http://semiaccurate.com/2014/03/27/denver-details-make-nvidias-explanations-tenuous/

With Nvidia damage control in full swing, lets take a look at why the Denver core is having problems. If you understand the underlying tech, some of the problems are obvious.

Everything he writes about nVidia is defamatory but AMD is all rosy and peachy. I don't have a problem with that at all and find it quite amusing, but it is clear that nVidia took his toys away or didn't allow him something, so like a spoilt brat he just whines at them constantly. I enjoy reading it, in truth :D
 