You know, ever since the MMX instructions, they are all matrix operations. And you don't need separate hardware; you just use the FPU for all of them.
No, MMX is not matrix operations. Those are packed SIMD (vector) operations: one instruction applies the same operation to each lane independently.
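To make the distinction concrete, here is a rough Python sketch of a packed add in the spirit of MMX's PADDW (four 16-bit lanes, wraparound on overflow). The function name `paddw` and the list-of-ints representation are just illustrative assumptions, not how the hardware is actually programmed.

```python
def paddw(a, b):
    """Lane-wise add of four 16-bit words with wraparound (modelled on MMX PADDW)."""
    if len(a) != 4 or len(b) != 4:
        raise ValueError("expected four 16-bit lanes per operand")
    # Each lane is added independently; no lane ever sees another lane's data.
    return [(x + y) & 0xFFFF for x, y in zip(a, b)]


print(paddw([1, 2, 3, 0xFFFF], [10, 20, 30, 1]))  # [11, 22, 33, 0]
```

Note that nothing here mixes rows with columns. It is the same scalar add repeated across lanes, which is why it counts as a vector operation rather than a matrix one.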
Why bother with any electronics? You can do matrix operations with steam valves. So what is your point?
The whole purpose of tensor cores is to accelerate various matrix operations so they run many times faster than can be achieved with the regular compute units in GPUs. Not only that, the computation can be done in parallel, freeing up the CUs to handle the rest of the rendering (or computation) pipeline.
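For reference, the operation a tensor core is built around is a fused matrix multiply-accumulate, D = A·B + C, on a small fixed-size tile. Below is a plain Python sketch of that arithmetic only. The name `mma_tile` and the 2x2 tile in the example are assumptions for illustration; real hardware uses its own tile sizes and reduced-precision inputs and does the whole tile in one hardware operation.

```python
def mma_tile(a, b, c):
    """Return D = A*B + C for square tiles given as lists of rows."""
    n = len(a)
    return [
        [
            # Each output element needs a full row of A and a full column of B,
            # unlike a lane-wise SIMD op where lanes never interact.
            c[i][j] + sum(a[i][k] * b[k][j] for k in range(n))
            for j in range(n)
        ]
        for i in range(n)
    ]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1], [1, 1]]
print(mma_tile(a, b, c))  # [[20, 23], [44, 51]]
```

On regular compute units this takes n multiply-adds per output element, issued as ordinary instructions. A tensor core does the whole tile as a single unit of work, which is where the speedup comes from.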