Would you care to explain this for numpties like me?
Okay - I'll give it a try!
Think of a matrix as a 2D array of numbers; say, N x N numbers arranged in a table.
Most scientific computing makes heavy use of "matrix-vector" multiplication, where the matrix is multiplied by a 1D array of numbers (a vector). For a matrix of size "N", this takes "N^2" floating point operations, but you also have to move roughly "N^2" numbers through memory, so each number you load only gets used once. If you want to make this sort of operation parallel, you are usually limited by how fast you can transfer data around the machine (i.e. by memory bandwidth). For these sorts of applications GPUs aren't so effective, because you saturate the memory bandwidth long before you max out the compute capability.
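To make the counting concrete, here's a rough sketch of a plain matrix-vector multiply (the function name and layout are just illustrative): every one of the "N^2" matrix entries is read from memory once and used in exactly one multiply-add, which is why the memory system, not the arithmetic units, sets the pace.

```
// Rough sketch: y = A * x for an N x N matrix A (row-major) and a vector x.
// ~N^2 multiply-adds, but also ~N^2 matrix elements read from memory,
// and each element of A is used exactly once -> memory-bandwidth bound.
void matvec(const float *A, const float *x, float *y, int N)
{
    for (int i = 0; i < N; ++i) {
        float sum = 0.0f;
        for (int j = 0; j < N; ++j) {
            sum += A[i * N + j] * x[j];   // one multiply-add per matrix element
        }
        y[i] = sum;
    }
}
```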
But, for certain scientific computing applications, you can formulate the problem in terms of matrix-matrix multiplications instead. Here you multiply one matrix (2D array) with another. This costs "N^3" operations, so you end up transferring a similar amount of data around, but need "N" times more computations. These are the sort of applications that benefit the most from GPUs, because you can unleash the full computing power of the GPU (... at least if your matrices are big enough, i.e. for large N).
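The matrix-matrix version looks almost identical in code; the only change is one extra loop, but that extra loop is the whole story: you still move ~"N^2" numbers per matrix, yet now do ~"N^3" multiply-adds, so each number you load gets reused ~"N" times. A rough sketch (again, names are just illustrative):

```
// Rough sketch: C = A * B for N x N matrices (row-major).
// ~N^3 multiply-adds over ~N^2 data per matrix: each element is reused ~N times,
// so for large N the arithmetic, not the memory traffic, dominates.
void matmul(const float *A, const float *B, float *C, int N)
{
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) {
            float sum = 0.0f;
            for (int k = 0; k < N; ++k) {
                sum += A[i * N + k] * B[k * N + j];
            }
            C[i * N + j] = sum;
        }
    }
}
```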
Previously, once you had moved your two matrices into the GPU memory, you would use the standard FP32 or FP64 cores to do the calculation - that is, you set up a list of individual multiply and add instructions for the CUDA cores to carry out in order to do the multiplication. With the "tensor cores", the multiplication itself is done in hardware: each tensor core takes a small block of each matrix (4x4 in Volta) and performs the whole multiply-accumulate for that block as a single operation, and a big matrix-matrix multiplication gets chopped up into lots of those blocks. So the only thing you can feed the tensor cores is pairs of (blocks of) matrices, but you get the result much, much faster than by going through the general purpose FP32 cores.
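For anyone who wants to see what that looks like in practice, CUDA exposes the tensor cores through its warp-level matrix multiply-accumulate (WMMA) intrinsics. Below is a minimal sketch of one warp multiplying a single 16x16 half-precision tile; the kernel name and launch configuration are just illustrative, and a real GEMM would loop over many such tiles:

```
#include <mma.h>
#include <cuda_fp16.h>
using namespace nvcuda;

// Minimal sketch: one warp multiplies a single 16x16 FP16 tile on the tensor cores.
// Launch with one warp, e.g. wmma_tile<<<1, 32>>>(dA, dB, dC); needs sm_70 or newer.
__global__ void wmma_tile(const half *A, const half *B, float *C)
{
    // Per-warp "fragments" are the units the tensor cores operate on.
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::row_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    wmma::fill_fragment(c_frag, 0.0f);                // start the accumulator at zero
    wmma::load_matrix_sync(a_frag, A, 16);            // load 16x16 tiles (leading dimension 16)
    wmma::load_matrix_sync(b_frag, B, 16);
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);   // one tensor-core multiply-accumulate
    wmma::store_matrix_sync(C, c_frag, 16, wmma::mem_row_major);
}
```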
So, any scientific application that relies heavily on matrix-matrix multiplications (where all or most of the numbers in the matrices are non-zero, i.e. dense matrices) could see a further ~10x speedup from this setup, on top of the (probably) 20-100x speedup it already sees over a CPU.
For applications like machine learning or molecular mechanics simulation, you could see a very real 10x speedup (if Nvidia's numbers are to be believed). For applications like finite element analysis or computational fluid dynamics - which mostly boil down to sparse matrix and matrix-vector operations, i.e. the memory-bound kind - it's not going to help at all.