First, SPEC2017 is not a five-year-old benchmark; it's updated regularly.
The year in the name only changes when the entire suite is redesigned.
SPEC is not purely synthetic either: it benchmarks real applications with the UI and I/O stripped out. 526.blender_r really runs Blender, 602.gcc_s really runs GCC, 625.x264_s really encodes with x264, and so on. This isn't your Cinebench or CPU-Z benchmark.
SPEC2017 is currently the gold standard for CPU benchmarks in industry; it is the no-nonsense benchmark. It measures CPU performance (including the memory subsystem) as cleanly as it gets: it doesn't compare "equivalent" workloads built on completely different libraries across platforms, and it doesn't lean on off-core accelerators or non-standard instructions. Not every SPEC sub-test is memory intensive, by the way. Most of multithreaded SPECfp is (the biggest exception is Blender), but FP calculations really do saturate memory bandwidth, so this is a genuine concern for multithreaded work. That is partly why HEDT platforms and servers have more memory channels, and it's a test that shows why you need more channels as you add cores.
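To make that concrete, here's a back-of-envelope roofline sketch with my own illustrative numbers (not SPEC data): bandwidth demand scales with core count, while bandwidth supply scales with channel count. The bytes-per-FLOP figure assumes a STREAM-triad-like streaming kernel, and the per-core peak is made up for illustration.

```python
# Back-of-envelope roofline: FP throughput capped by memory bandwidth vs. by cores.
# All numbers are illustrative assumptions, not SPEC results.

BYTES_PER_FLOP = 12.0        # STREAM-triad-like kernel: 2 loads + 1 store of doubles per 2 FLOPs
CHANNEL_GBPS = 25.6          # one DDR4-3200 channel: 3200 MT/s * 8 bytes per transfer
PEAK_GFLOPS_PER_CORE = 50.0  # assumed per-core FP peak, made up for illustration

# Roughly desktop / HEDT / server core-and-channel counts.
for cores, channels in [(8, 2), (24, 4), (64, 8)]:
    compute_cap = cores * PEAK_GFLOPS_PER_CORE                # GFLOP/s if cores were the limit
    bandwidth_cap = channels * CHANNEL_GBPS / BYTES_PER_FLOP  # GFLOP/s the memory can actually feed
    limit = "bandwidth-bound" if bandwidth_cap < compute_cap else "compute-bound"
    print(f"{cores:2d} cores / {channels} channels: "
          f"compute cap {compute_cap:7.1f} GFLOP/s, "
          f"memory cap {bandwidth_cap:6.1f} GFLOP/s -> {limit}")
```

With these assumptions every configuration ends up bandwidth-bound on a streaming kernel, which is exactly the point: adding cores without adding channels only lowers the achievable fraction of peak.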
SPEC benchmarks are how AWS, Google, Microsoft Azure, etc. decide which CPUs to buy for their datacentres, and how laptop makers choose their chips. It's how every academic computer architecture paper measures its results. And it's how Intel/AMD/ARM/Apple/etc. test their CPUs internally (because they know their customers will test them this way). If it's good enough for the entire chip industry, it's as no-nonsense as it gets.
SPEC runs a search programme for new workloads that aren't covered by existing benchmarks, and it's incredibly difficult to find one. You need a new benchmark whose result isn't predictable from a blend of existing benchmarks. Your typical online CPU reviewer's benchmarks don't even come close (you get R² > 0.99 across multiple CPUs for all the usual suspects: games, Cinebench, Geekbench, browser tests, AI, etc.). So if you come up with one, they pay you up to $9000 (see the sketch below the link):
https://www.spec.org/cpuv8/
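For illustration, the "predictable from a blend" test amounts to a linear regression across a set of CPUs. A minimal sketch with randomly generated placeholder scores (only the method matters here, not the numbers):

```python
# Sketch: is a candidate benchmark just a linear blend of existing ones?
# Scores are randomly generated placeholders, purely to demonstrate the method.
import numpy as np

rng = np.random.default_rng(0)
n_cpus, n_existing = 20, 10
existing = rng.uniform(1.0, 10.0, size=(n_cpus, n_existing))  # existing sub-scores per CPU

# A candidate that really is a blend of existing scores (plus a little noise).
candidate = existing @ rng.uniform(0.0, 1.0, size=n_existing) + rng.normal(0.0, 0.01, n_cpus)

# Least-squares fit of the blend (with an intercept), then R^2.
X = np.column_stack([existing, np.ones(n_cpus)])
coef, *_ = np.linalg.lstsq(X, candidate, rcond=None)
pred = X @ coef
r2 = 1.0 - np.sum((candidate - pred) ** 2) / np.sum((candidate - candidate.mean()) ** 2)
print(f"R^2 = {r2:.4f}")  # ~0.99+ means the "new" benchmark adds nothing new
```

A workload that wins the bounty is one where no weighting of the existing sub-scores gets R² anywhere near that.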
There are already a few such workloads that SPEC2017 doesn't cover, and other benchmarks such as TPC-C, PerfKitBenchmarker, or SPECjbb handle them.
This is why you can take your workload, split it into tasks (e.g. compression, decompression, code compilation, search algorithms), profile it to see how much time is spent on each part, then cross-reference those fractions against specific SPEC sub-scores, and get an excellent picture of how a chip will perform on those tasks.
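A sketch of that cross-reference, assuming hypothetical time shares and score ratios, and a rough task-to-sub-benchmark pairing of my own (557.xz_r for compression, 602.gcc_s for compiles, 631.deepsjeng_s for tree search):

```python
# Sketch: estimate chip B vs. chip A for *your* workload from profiled time shares
# and per-sub-benchmark score ratios. All numbers are hypothetical placeholders;
# the task -> SPEC sub-benchmark mapping is a rough pairing of my own.

# Fraction of runtime your profiler attributes to each task (should sum to ~1.0).
time_share = {
    "compression (557.xz_r)":        0.30,
    "code compile (602.gcc_s)":      0.50,
    "tree search (631.deepsjeng_s)": 0.20,
}

# Hypothetical per-sub-benchmark speedup of chip B over chip A (score_B / score_A).
speedup = {
    "compression (557.xz_r)":        1.10,
    "code compile (602.gcc_s)":      1.35,
    "tree search (631.deepsjeng_s)": 1.20,
}

# Amdahl-style combination: new_time = sum(old_share / task_speedup).
new_time = sum(time_share[t] / speedup[t] for t in time_share)
overall = 1.0 / new_time
print(f"Estimated overall speedup of chip B for this workload: {overall:.2f}x")
```

The weighted combination is the part that matters; swap in your own profile fractions and the published sub-scores for the two chips you're comparing.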
This is also why Intel themselves publish SPEC benchmarks when they release Xeon products:
https://www.intel.co.uk/content/www/uk/en/benchmarks/xeon-scalable-benchmark.html
AMD does the same:
https://www.amd.com/system/files/documents/amd-epyc-7002-scalemp-speccpu-performance-brief.pdf
https://www.amd.com/system/files/2018-03/AMD-SoC-Sets-World-Records-SPEC-CPU-2017.pdf
As no-nonsense as it ever gets.
Once you move from industry-standard benchmarks to random application benchmarks, tests become more valuable for purchasing decisions (you see results for the apps you actually use) and far less valuable for CPU comparisons, especially across platforms. They become whole-platform benchmarks, which again are useful for purchasing decisions but far less so for microarchitecture comparisons.
Give me something you can't do with SPEC2017; you and I can each buy an M1 Ultra Mac