
Meet the world's largest chip, manufactured on TSMC's 16nm process

https://wccftech.com/meet-cerebras-...ore-than-56-times-the-size-of-an-nvidia-v100/


Nvidia's CEO said TSMC's 300mm wafers limited chip size to around 850mm2 and that you can't go beyond that. At 815mm2 on TSMC's 12nm process, the Volta V100 was the world's largest GPU chip.

But Jensen Huang got it wrong: Volta has now lost the world's-largest-chip crown to the Cerebras Systems WSE (Wafer Scale Engine), a 46,225mm2 AI chip containing 1.2 trillion transistors, 400,000 cores and 18GB of SRAM, with 9 PB/s memory bandwidth (yes, that is 9 petabytes per second :eek:) and 100 Pb/s fabric bandwidth (yes, that is 100 petabits per second :eek:). The chip is manufactured on TSMC's 16nm process and consumes 15 kW of power.
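Quick sanity check on the headline claim in the link above, using nothing but the two die areas:

```python
# Checking the "more than 56 times the size of an Nvidia V100" headline
# against the quoted die areas.
wse_area_mm2 = 46_225   # Cerebras WSE
v100_area_mm2 = 815     # Nvidia Volta V100

print(wse_area_mm2 / v100_area_mm2)  # ~56.7 -> "more than 56 times"
```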

I found Cerebras's website:

https://www.cerebras.net/

I read about Cerebras a year ago; the CEO seemed very confident, claiming their first chip would win the AI, machine learning and data centre war and reckoning it would beat Nvidia, but I dismissed it as utter nonsense. After reading this news I've changed my view: the Cerebras CEO is very serious about this chip. I guess AMD, Intel and Nvidia are in huge trouble, as the big three can't compete with a 1.2-trillion-transistor chip right now. The WSE's devastating, almost unimaginable 100 Pb/s fabric bandwidth and 9 PB/s memory bandwidth make NVLink and the future HBM3 and HBM4 look obsolete.

Hopefully Jensen Huang changed his view a while ago after hearing about the Cerebras WSE, realised Nvidia can create GPUs far beyond 850mm2, and will develop a 1.2-trillion-transistor Volta successor for the data centre that can do full-scene ray tracing while consuming a lot less than 15 kW in 2020.
 
I think you are being a bit enthusiastic there - it is essentially still a bunch of chips stitched together, and given the issues cooling it, the resulting performance from keeping the frequency at feasible levels probably won't outperform a bunch of separate chips connected via a longer interconnect for compute work.

I can't see it competing with those 3 - it will probably be king of a specific niche but not have broad application.
 
I think you are being a bit enthusiastic there - it is essentially still a bunch of chips stitched together, and given the issues cooling it, the resulting performance from keeping the frequency at feasible levels probably won't outperform a bunch of separate chips connected via a longer interconnect for compute work.

I can't see it competing with those 3 - it will probably be king of a specific niche but not have broad application.

According to Cerebras CEO Andrew Feldman, a single Nvidia DGX-2 system can't compete with a single Cerebras WSE; Nvidia would need 20 to 30 DGX-2 systems on NVSwitches to match it. One DGX-2 system delivers 2 petaflops, so 20 to 30 DGX-2 systems on NVSwitches deliver 40 to 60 petaflops. AMD's future Frontier supercomputer, due in 2021, will have 100 cabinets delivering 1.5 exaflops, so a single cabinet will deliver 15 petaflops, meaning the Cerebras WSE would be up to 4 times faster than the EPYC 4 CPUs and Arcturus GPU successors that don't even exist yet.
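The back-of-the-envelope maths behind that, for anyone checking (all numbers are Feldman's claims and the Frontier targets, not benchmarks):

```python
# Throughput comparison using only the figures quoted above.
dgx2_pflops = 2                          # one Nvidia DGX-2 system
wse_equiv_range = (20, 30)               # Feldman's claimed DGX-2 equivalence

wse_pflops_range = [n * dgx2_pflops for n in wse_equiv_range]
print(wse_pflops_range)                  # [40, 60] petaflops

frontier_pflops = 1.5 * 1000             # 1.5 exaflops expressed in petaflops
frontier_cabinets = 100
cabinet_pflops = frontier_pflops / frontier_cabinets
print(cabinet_pflops)                    # 15.0 petaflops per cabinet

print(max(wse_pflops_range) / cabinet_pflops)  # 4.0 -> "up to 4 times faster"
```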

https://medium.com/syncedreview/hot...-ai-chip-as-big-as-a-notebook-why-4d068429349

It seems Cerebras did an amazing job stitching the chips together: really beautiful, very clean and very neat, resembling a bank card's security chip, and much better than AMD's shoddy work with HBM, HBM2, CPU chiplets and the I/O die. It looks like it's probably 84 dies, so each die would have 214MB of SRAM, 107TB/s of memory bandwidth and 1.19Pb/s of fabric bandwidth. Wow, insanely faster than NVLink fabric's 900GB/s bandwidth and the non-existent HBM3 and HBM4.
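Rough per-die maths, assuming my eyeball count of 84 dies from the photo is right (it's not an official figure):

```python
# Dividing the wafer-level specs quoted earlier by an assumed 84 dies.
dies = 84
sram_gb = 18
mem_bw_pb_per_s = 9          # petabytes per second
fabric_bw_pbit_per_s = 100   # petabits per second

print(sram_gb * 1000 / dies)          # ~214 MB of SRAM per die
print(mem_bw_pb_per_s * 1000 / dies)  # ~107 TB/s memory bandwidth per die
print(fabric_bw_pbit_per_s / dies)    # ~1.19 Pb/s fabric bandwidth per die
```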

If the Cerebras WSE does well in benchmarks and gets lots of press in the next few months, then I think Nvidia will plan to acquire Cerebras.
 
Oh wow that power draw though!

Equates to about £20 a day for electricity, running it for 8 hours.

A single Cerebras WSE draws 15 kW. One Nvidia DGX-2 system draws 10 kW, so matching the WSE's performance would need 20 DGX-2 systems at 200 kW or 30 DGX-2 systems at 300 kW. That's much faster and 20 times more energy efficient than a 300 kW cabinet of AMD's upcoming 2021 Frontier supercomputer.
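The power maths, using the figures above (the ~17p/kWh electricity rate is my assumption, picked to reproduce the £20/day estimate):

```python
# Power comparison from the quoted figures.
wse_kw = 15
dgx2_kw = 10

for n in (20, 30):
    print(n, "DGX-2 systems:", n * dgx2_kw, "kW")   # 200 kW and 300 kW

frontier_cabinet_kw = 300
print(frontier_cabinet_kw / wse_kw)   # 20.0 -> "20 times more energy efficient"

kwh_per_8h = wse_kw * 8               # 120 kWh for an 8-hour day
print(kwh_per_8h * 0.17)              # ~£20 of electricity (assumed 17p/kWh)
```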
 
I think you are being a bit enthusiastic there - it is essentially still a bunch of chips stitched together, and given the issues cooling it, the resulting performance from keeping the frequency at feasible levels probably won't outperform a bunch of separate chips connected via a longer interconnect for compute work.

I can't see it competing with those 3 - it will probably be king of a specific niche but not have broad application.

This chip is very specific to deep learning and some related machine learning applications. It is purposely extremely specialised, as that is what gives it the enormous efficiency.


I think the real question is whether it is cheaper and/or more power efficient to get equalised performance using this giant wafer-sized "chip" vs a cluster of dozens of Volta-sized GPUs.

Nvidia does need to continue removing GPU concepts from their HPC line to remain competitive, though. A large chunk of the Volta die is still dedicated to graphics, which is a bit absurd. AMD have barely even started separating functionality.
 
I'd love to know how they're going to drag 15kW of heat off something that size evenly... the fact that all they are showing is bare wafers makes me quite sceptical about the practicalities of the final product.
 
No room in my case to stick 4 of them.:D

Having said that, next year's 3XXX series of cards from NVidia will be using the above chips, if only to get RTX and DLSS working at 30fps. :eek:
Haha.

Looking forward to the 3000 series, but will only be buying 1 :p

When is your prediction, Kaaps? I am thinking Q1 2020, at worst Q2.
 
Haha.

Looking forward to the 3000 series, but will only be buying 1 :p

When is your prediction, Kaaps? I am thinking Q1 2020, at worst Q2.

Lots of rumours in Q1

Actual midrange products (3070 and 3080) at the end of June, in Q2.

The above is only my guess, but I think it is going to take NVidia a while to get their act together with 7nm, as anything they produce will still be quite large chips if they want to include RTX. A future 3070, for example, is going to need to perform at around the same level as or better than the current RTX Titan if the new tech/cards are going to have any credibility.

NVidia can hide behind the 2XXX series being the first generation of cards to use RTX, and hence the performance being a bit lower than users would like, but that excuse will be wearing a bit thin for the 3XXX series.
 
https://www.zdnet.com/article/cerebras-ceo-big-implications-for-deep-learning-in-companys-big-chip/

"The more interesting thing would be to divide up work so that some of your 400,000 cores work on one layer, and some on the next layer, and some on the third layer, and so on, so that all layers are being worked on in parallel," he muses. One effect of that is to vastly multiply the size of the parameter state that can be handled for a neural network, he says. With a GPU's data parallelism, any one GPU might be able to handle a million parameters, say. "If you put two GPUs together [in a multi-processing system], you get two machines that can each handle a million parameters," he explains, "but not a machine that can handle 2 million parameters — you don't get a double."

With the single WSE, it's possible to support a four-billion parameter model. Cluster the machines together, he suggests, and "you can now solve an eight-billion or 16-billion parameter network, and so it allows you to solve bigger problems by adding resources."
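The distinction he's drawing can be sketched in a few lines. This is purely illustrative (the function names are mine and the 1-million-parameter figure is just the quote's example, not a real hardware limit):

```python
# Data parallelism: every device holds a full copy of the model, so parameter
# capacity doesn't grow with device count. Model parallelism: each device holds
# different layers, so capacity scales with devices.

def data_parallel_capacity(devices: int, params_per_device: int = 1_000_000) -> int:
    # every device replicates the same parameters -> "you don't get a double"
    return params_per_device

def model_parallel_capacity(devices: int, params_per_device: int = 1_000_000) -> int:
    # layers are spread across devices -> capacity scales with device count
    return devices * params_per_device

print(data_parallel_capacity(2))   # 1000000 -> two GPUs, still a 1M-parameter model
print(model_parallel_capacity(2))  # 2000000 -> split the layers and it does double
```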

:eek:

Wow, a single WSE can handle a 4-billion-parameter model; that is 4,000 times more than an Nvidia Volta V100 GPU, Intel's upcoming Xe datacentre GPU, or AMD's Radeon Instinct Vega and Navi GPUs, and even the fastest supercomputers with thousands of GPUs are still limited to a 1-million-parameter model. Two WSE machines in a cluster can handle 8 billion parameters, and four WSE machines together can handle 16 billion. Bloody hell, AMD's upcoming Frontier supercomputer, due in 2021 with 100 cabinets and thousands of Radeon Instinct GPUs on a network, will still be limited to a 1-million-parameter model while using over 30 MW of power. Just imagine 100 water-cooled cabinets with 100 WSEs handling a 400-billion-parameter model to solve the biggest problems ever, using just 1.5 MW of power.
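The scaling maths behind those numbers (taken straight from the ZDNet quote and the posts above):

```python
# Parameter-capacity and power scaling from the quoted figures.
gpu_params = 1_000_000           # ~1M parameters per GPU, per the quote
wse_params = 4_000_000_000       # 4 billion on a single WSE

print(wse_params / gpu_params)   # 4000.0 -> "4,000 times more"

for n in (2, 4, 100):
    print(n, "WSEs:", n * wse_params)  # 8B, 16B and 400B parameters

print(100 * 15 / 1000, "MW")     # 100 WSEs at 15 kW each = 1.5 MW
```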

Jensen Huang, Lisa Su and Raja Koduri will all be sweating and very concerned now.
 
NVidia can hide behind the 2XXX series being the first generation of cards to use RTX, and hence the performance being a bit lower than users would like, but that excuse will be wearing a bit thin for the 3XXX series.
Agreed. I am expecting the 3070 to at least have the grunt of the 2080 Ti, but it really should have more RT cores, or more RT performance from other enhancements.
 