That seems to make sense. I assume the mixed precision stuff they talked about wouldn't allow them to use the 1*64bit units to output 1*32bit results, and that it's related to something else?
Nah, the DP unit seems to be pure 64bit work. Only the SP units seem capable of performing two 16bit int instructions per clock. I believe the mixed precision comes from the ability of the GPU to run 16, 32 and 64 bit instructions simultaneously, although it could just be referring to the SP units being capable of 16bit int instructions.
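To make the "two 16bit int instructions per clock" idea a bit more concrete, here is a rough sketch of what packing two 16bit values into one 32bit lane and operating on both halves at once could look like. This is purely illustrative and says nothing about how the actual hardware is wired; the function names (`pack2x16`, `unpack2x16`, `add2x16`) are made up for this example.

```python
MASK16 = 0xFFFF  # low 16 bits of a word


def pack2x16(lo, hi):
    """Pack two unsigned 16-bit integers into one 32-bit word."""
    return ((hi & MASK16) << 16) | (lo & MASK16)


def unpack2x16(word):
    """Split a 32-bit word back into its (low, high) 16-bit halves."""
    return word & MASK16, (word >> 16) & MASK16


def add2x16(a, b):
    """Add the two 16-bit halves independently.

    Each half wraps modulo 2**16, so a carry out of the low half
    never leaks into the high half.
    """
    lo = ((a & MASK16) + (b & MASK16)) & MASK16
    hi = (((a >> 16) & MASK16) + ((b >> 16) & MASK16)) & MASK16
    return (hi << 16) | lo
```

So one 32bit-wide operation ends up producing two independent 16bit results, which is roughly the kind of thing I'd guess the SP units are doing, assuming that reading of the slides is right.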