
How many SMs in full-fat TU-102 and TU-104?

Discussion in 'Graphics Cards' started by ToOo, Aug 21, 2018.

  1. ToOo

    Associate

    Joined: Oct 16, 2012

    Posts: 73

    As far as I can see this information hasn't been publicly announced, but I'm thinking maybe we have some information to work from.

    Nvidia's Maxwell and Pascal architectures have 128 ALUs (aka CUDA cores) per Streaming Multiprocessor (SM). But what about Turing, does that hold true?

    GP-102, for example, has 30 SMs for a total of 3840 CUDA cores, but this was only ever fully released in the Titan Xp, while the Titan X (Pascal) and 1080 Ti only ever exposed 28 SMs.

    The image below is from the RTX 2080 Ti page and I believe it represents TU-102. I've highlighted it for clarity: it appears to show 6 sets of 2x3 units arranged in two blocks on either side of the chip. In theory that correlates to 36 SMs in total for TU-102 if we are talking 128 ALUs per SM. It's worth noting that I could be out by a factor of 2, and TU-102 might have 72 SMs if it's 64 ALUs per SM as with Volta.
    [​IMG]

    Since we know that the RTX 2080 Ti has 4352 CUDA cores, that would imply only 34 (or 68) of the possible 36 (or 72) SMs are activated, which leaves room for a theoretical 4608 CUDA core Titan R card to sit above the 2080 Ti in the product stack.
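    For anyone who wants to check the numbers, here is a rough Python sketch of the arithmetic above. The core counts are the ones quoted in this post; the 36-SM full die is only my reading of the die shot, and `sm_count` is a made-up helper name for illustration.

```python
# Sketch only: the 36-SM (or 72-SM) full die is an assumption taken from
# the die shot reading in this post, not a confirmed specification.

def sm_count(cuda_cores: int, cores_per_sm: int) -> int:
    """Number of SMs implied by a CUDA core count (hypothetical helper)."""
    sms, remainder = divmod(cuda_cores, cores_per_sm)
    if remainder:
        raise ValueError(
            f"{cuda_cores} cores is not a whole number of {cores_per_sm}-core SMs"
        )
    return sms

RTX_2080_TI_CORES = 4352
FULL_SMS_AT_128 = 36  # assumed: 6 sets of 2x3 units

for cores_per_sm in (128, 64):
    # Halving the cores per SM doubles the SM count for the same die.
    full_sms = FULL_SMS_AT_128 * 128 // cores_per_sm
    enabled = sm_count(RTX_2080_TI_CORES, cores_per_sm)
    print(f"{cores_per_sm} cores/SM: {enabled} of {full_sms} SMs enabled, "
          f"full die = {full_sms * cores_per_sm} cores")
```

    Either way the full die works out to 4608 cores; only the SM count changes with the assumed cores per SM.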

    Unfortunately I can't seem to find a similar graphic for TU-104, so there's not really enough information to speculate further.

    EDIT: As muon points out below, just look at the Quadro RTX 5000 and 6000: TU-104 = 3072 and TU-102 = 4608 cores.
     
    Last edited: Aug 22, 2018
  2. SupernovaUK

    Associate

    Joined: Jul 23, 2014

    Posts: 26

    I'm confused by all the negative speculation around the 2080Ti regarding its relative performance compared to the 1080Ti.

    Ignoring the ray tracing and AI components of the chip, there are 768 additional CUDA cores over the 1080 Ti (4352 - 3584). Not sure how the clock speeds will play out, but they should be very close.

    I watched the Nvidia launch and heard Jensen Huang saying there is a new architecture for these chips and the SMs will have significantly improved performance over Pascal. So just on additional SMs and the new architecture surely these cards will be significantly faster than a 1080Ti before any additional new capabilities come into play. When you add in the sizable uplift in memory bandwidth and the new capabilities of the AI segment of the chip to improve performance I'm expecting it to be a sizable step up in performance over pascal.

    I'm looking forward to the benchmarks and seeing if all this negative speculation is justified.
     
  3. Minstadave

    Capodecina

    Joined: Jan 8, 2004

    Posts: 24,097

    Location: Rutland

    If the performance was something to write home about, we’d have been told about it.

    Yes the 2080Ti has a fair few more CUDA cores but that ignores the fact that it’s an entire price tier more expensive than the 1080Ti.
     
  4. crinkleshoes

    Capodecina

    Joined: Jun 9, 2009

    Posts: 11,607

    Location: London, McLaren or Radical

    SP alone at same clocks would be a 20% bump.

    With IPC improvements and GDDR6 bandwidth improvements... I'm hoping for closer to 30%
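    A quick sketch of where the roughly 20% figure comes from, using only the core counts already quoted in this thread and assuming equal clocks and perfect scaling with core count (neither of which is known yet):

```python
# Assumes identical clocks and perfect scaling with core count, which is
# an idealisation; real-world gains will differ.
GTX_1080_TI_CORES = 3584
RTX_2080_TI_CORES = 4352

extra_cores = RTX_2080_TI_CORES - GTX_1080_TI_CORES
uplift = extra_cores / GTX_1080_TI_CORES

print(f"{extra_cores} extra cores, {uplift:.1%} more than the 1080 Ti")
```

    That comes out a little over 21%, so anything beyond that would have to come from IPC, clocks or memory bandwidth.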
     
  5. iakhtar

    Gangster

    Joined: Oct 29, 2009

    Posts: 144

    We don't know anything about heat or power either. It might not be able to sustain high clocks as well as Pascal, for example, which wouldn't be surprising with that big die.
     
  6. muon

    Capodecina

    Joined: Nov 8, 2006

    Posts: 17,528

    Location: London

    Isn't it simply a case of looking at the Quadro RTX?

    We know that has 36 SMs with 128 shaders (or 72 SMs with 64), resulting in 4608 shaders.

    Answered.

    2080Ti will outperform the 1080Ti, more shaders. But the 2080 has fewer than the 1080Ti by quite some margin.
     
  7. ToOo

    Associate

    Joined: Oct 16, 2012

    Posts: 73

    Yes, yes it is. I didn't spot that at all. Thanks :)
     
  8. AthlonXP1800

    Mobster

    Joined: Sep 28, 2014

    Posts: 2,672

    Location: Scotland

    Same as with the 1080, which has fewer CUDA cores than the 980 Ti.
     
  9. Kaapstad

    Man of Honour

    Joined: May 21, 2012

    Posts: 27,757

    Location: Dalek flagship

    I suspect that the full fat Turing chip will pack 5120 SP cores, a 384-bit bus and 12GB of GDDR6, and will appear as a Titan variant some time in the near future.

    My guess is that it will also be called the TU100 chip.

    I also don't think we will see it until NVidia have sold as many 2080 Ti cards as the market will take as the price of the full fat chip will be eye watering.
     
  10. ubersonic

    Capodecina

    Joined: May 26, 2009

    Posts: 20,385

    I dunno about the TU-102, but the TU-104 has a max takeoff weight of 78,100kg, so factor in fuel and you should be able to get a good 500,000+ on there, it's the physical size of the cards and their boxes that will be the limiting factor.
     
  11. LeMson

    Wise Guy

    Joined: Mar 21, 2012

    Posts: 1,999

  12. Mauller

    Mobster

    Joined: Feb 7, 2015

    Posts: 2,663

    Location: In Space

    I don't think anything like that will happen, not until 7nm at least. But I think Nvidia have fully segmented their compute and gaming with 2 very different architectures. Volta and Turing.
     
  13. COYS

    Hitman

    Joined: Mar 30, 2007

    Posts: 868

    I'm thinking they run hotter, hence the Founders Edition now coming with a dual fan design.
     
  14. Kaapstad

    Man of Honour

    Joined: May 21, 2012

    Posts: 27,757

    Location: Dalek flagship

    Have a look at the 2080 Ti PCB: there is room for 12 memory chips, or in other words there is something bigger in the wings.

    I think the only reason we don't see the full (5120 SP) chip is yields and cost that goes with it.
     
  15. Mauller

    Mobster

    Joined: Feb 7, 2015

    Posts: 2,663

    Location: In Space

    TU102 is already 754 mm^2 in size, so there is likely nothing bigger than it until 7nm. And TU102 is 128*6*6 CUDA cores, so 4,608.

    They are also using 11 memory packages, with these parts being cut down.

    The Quadros also use completely different PCBs to these consumer parts.
     
  16. Kaapstad

    Man of Honour

    Joined: May 21, 2012

    Posts: 27,757

    Location: Dalek flagship

    GV100 die as used in the Titan V is bigger.

    There is a reason NVidia are calling the die used in the 2080 Ti the TU102 and that is because there is a TU100 die lurking around somewhere.

    Also have you tried adding 512 + 4608 together, you get a number that has appeared somewhere before, or putting it another way 10% of the TU102 is probably disabled for yield reasons.
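    A rough sketch of that arithmetic, for reference. The 5120-core full die is purely my guess, and `disabled_fraction` is just an illustrative helper name:

```python
# Sketch: the 5120-core die is a guess from this post, not a known part.

def disabled_fraction(enabled_cores: int, full_cores: int) -> float:
    """Share of a die's cores that are fused off (hypothetical helper)."""
    return (full_cores - enabled_cores) / full_cores

GUESSED_FULL_DIE = 4608 + 512  # 5120 cores

# 4608-core part cut from the guessed 5120-core die.
print(f"{disabled_fraction(4608, GUESSED_FULL_DIE):.1%} disabled")

# For comparison: the 4352-core 2080 Ti against a 4608-core die.
print(f"{disabled_fraction(4352, 4608):.1%} disabled")
```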
     
  17. Mauller

    Mobster

    Joined: Feb 7, 2015

    Posts: 2,663

    Location: In Space

    512 added doesn't work, bud. The SM count is not right: you would need to add 768 to the overall design, giving Volta SM to GPC ratios. But in doing that you need to add more tensor, integer and RT cores to keep the ratio, which will make it much bigger. Likely bigger than Volta and not worth the extra SMs on 12/16nm.
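    Here is a small sketch of why the SM count doesn't work out, assuming the 6-GPC layout discussed earlier in the thread and that every GPC carries the same whole number of SMs. `sms_per_gpc` is a made-up helper for illustration only.

```python
# Assumptions: 6 GPCs and an equal, whole number of SMs in every GPC.
from typing import Optional

GPCS = 6

def sms_per_gpc(total_cores: int, cores_per_sm: int) -> Optional[int]:
    """SMs per GPC if the cores split evenly across the GPCs, else None."""
    sms, leftover_cores = divmod(total_cores, cores_per_sm)
    if leftover_cores:
        return None
    per_gpc, leftover_sms = divmod(sms, GPCS)
    return per_gpc if leftover_sms == 0 else None

for label, cores in (("TU102", 4608),
                     ("TU102 + 512", 4608 + 512),
                     ("TU102 + 768", 4608 + 768)):
    for cores_per_sm in (128, 64):
        print(label, cores, f"{cores_per_sm}/SM ->",
              sms_per_gpc(cores, cores_per_sm))
```

    Under those assumptions 5120 cores never splits evenly across 6 GPCs, while 5376 (4608 + 768) gives 7 SMs per GPC at 128 cores per SM, or 14 at 64.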

    Apparently there are also Turing parts being made on 16nm, making them around 780mm^2.

    Bigger die also means more variability per chip and higher possibility of fewer dies that work to spec. NVidia are already pushing it with TU102 for a consumer part. Anything higher core will be 7nm.

    Turing is either going to be transferred to 7nm and become the TU2xx series or it is going to be a very short lived one since they already launched the x80TI part.
     
  18. Silent_Scone

    Capodecina

    Joined: Sep 5, 2011

    Posts: 12,348

    Location: Surrey


    Precisely. It would be closer to Titan V <price>, which is crazy. Power draw would also be interesting lol.
     
  19. Meaker

    Mobster

    Joined: May 19, 2004

    Posts: 3,459

    They are not going to pay a billion for the masks to get an extra 10% die area on the same design.
     
  20. Kaapstad

    Man of Honour

    Joined: May 21, 2012

    Posts: 27,757

    Location: Dalek flagship

    The area is probably already there but used to increase yields on defective chips.

    For example the GV100 chip has more than the 5120 cores you actually get to use.

    To produce the TU102 chip with 4352 cores and all its other features for the price and yield NVidia want means using bigger chips with defects on them.