OK. Hmm.
Theoretical GFLOPS have always been a bit meaningless, but they've become a lot less useful recently, now that high-core-count parts adjust their clock speeds considerably depending on the workload.
So while the E7-8890 v4 has a base clock speed of 2.20 GHz (which in theory would yield an incredible 1690 SP GFLOPS), it would almost certainly drop below that clock speed when running high-performance AVX code, so the peak you could actually reach would be lower.
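For reference, here's a rough sketch of where that 1690 figure comes from, assuming 24 cores and 32 single-precision FLOPs per cycle per core (two 8-wide FMA units). The function name `peak_gflops` and the 1.80 GHz "AVX clock" are my own placeholders for illustration. The latter is a made-up value, not a spec number.

```python
def peak_gflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak = cores * clock (GHz) * FLOPs per cycle per core."""
    return cores * clock_ghz * flops_per_cycle


# E7-8890 v4 at its 2.20 GHz base clock (assuming 24 cores, 32 SP FLOPs/cycle/core).
base = peak_gflops(cores=24, clock_ghz=2.20, flops_per_cycle=32)

# Same chip at a HYPOTHETICAL reduced clock under heavy AVX load.
# 1.80 GHz is an illustrative guess only, not a published figure.
avx = peak_gflops(cores=24, clock_ghz=1.80, flops_per_cycle=32)

print(f"base clock: {base:.0f} SP GFLOPS")
print(f"hypothetical AVX clock: {avx:.0f} SP GFLOPS")
```

The point is just that the "peak" scales linearly with whatever clock the chip actually sustains, so the headline number is only as good as the clock you plug in.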
Also, theoretical numbers aren't much good for showing the difference between, say, Haswell and Skylake. Both have the same 32 SP FLOPs per cycle per core, and similar core counts and clock speeds, but Skylake has various improvements that give it better benchmark performance, and those won't show up in theoretical peak numbers.
Not sure what to recommend, sorry. Curious what you come up with though.