According to that, one of the biggest issues for Ryzen in gaming (and some other workloads) is actually the extremely high latency and low bandwidth between the CCXes, which is exacerbated in moderately threaded situations by Windows 10 regularly moving threads between cores. If a thread gets moved and its data is still sitting in the other CCX's L3, it'll take a cache miss and a huge latency penalty pulling that data back in.
Assuming 4C Ryzen works by completely deactivating one CCX (which seems logical given the halving of L3 cache as well), that won't be a problem for it - there won't be another CCX for threads to get migrated to. So part of the problem may be mitigated inherently by the way the dies are harvested!
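To put a rough number on that, here's a toy calculation. It assumes (purely for illustration, this is not how the Windows scheduler actually picks cores) that a migrating thread lands on a uniformly random other core. The function name and the core/CCX counts passed in are my own placeholders, not anything from AMD.

```python
from fractions import Fraction


def cross_ccx_probability(cores_per_ccx: int, ccx_count: int) -> Fraction:
    """Chance that a thread moved to a uniformly random *other* core
    ends up in a different CCX (toy model, not real scheduler behaviour)."""
    total_cores = cores_per_ccx * ccx_count
    if total_cores < 2:
        return Fraction(0)  # nowhere to migrate to
    cores_in_other_ccxes = total_cores - cores_per_ccx
    # The thread can land on any core except the one it is already on.
    return Fraction(cores_in_other_ccxes, total_cores - 1)


print(cross_ccx_probability(4, 2))  # 8C part, two CCXes: 4/7
print(cross_ccx_probability(4, 1))  # 4C part, one CCX: 0
```

So under that assumption a bit over half of all migrations on the 8C part would cross the CCX boundary, and none would on a single-CCX 4C part.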
EDIT: looking at the SMT scaling you posted, it looks like Civ and GTA V are least affected, and I believe those are the most CPU intensive games in that list? That would make sense if Windows 10 only moves threads when cores are lightly loaded - put lots of load on the cores and there's no thread movement, so no cross-CCX cache misses; lightly load the cores and you get more thread movement and more cache misses. If so, that should be fixable in a driver or the scheduler - simply tell the scheduler not to move active threads!
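Until something like that exists, the same effect can be approximated by hand by restricting a process to one CCX. A minimal sketch, assuming logical CPUs are numbered contiguously CCX by CCX (so 0-7 are CCX 0 on an 8C/16T part; check that on your own system). The helper names are made up for this post, and `os.sched_setaffinity` only exists on Linux, so on other systems the pinning step is skipped.

```python
import os


def ccx_logical_cpus(ccx_index: int, cores_per_ccx: int = 4,
                     threads_per_core: int = 2) -> set:
    """Logical CPU ids belonging to one CCX.
    Assumption: ids are contiguous per CCX, SMT siblings adjacent."""
    per_ccx = cores_per_ccx * threads_per_core
    start = ccx_index * per_ccx
    return set(range(start, start + per_ccx))


def affinity_mask(cpus: set) -> int:
    """Bitmask with one bit set per logical CPU id."""
    return sum(1 << cpu for cpu in cpus)


def pin_current_process_to_ccx(ccx_index: int = 0) -> set:
    """Restrict this process to one CCX; returns the CPUs actually used."""
    if not hasattr(os, "sched_setaffinity"):
        return set()  # not available on this OS, nothing changed
    # Only keep CPUs we are currently allowed to run on.
    target = ccx_logical_cpus(ccx_index) & os.sched_getaffinity(0)
    if target:
        os.sched_setaffinity(0, target)
    return target


print(hex(affinity_mask(ccx_logical_cpus(0))))  # 0xff
```

On Windows the mask is the useful part: under the same numbering assumption, `start /affinity FF game.exe` from a command prompt would launch a (placeholder) `game.exe` restricted to the first eight logical CPUs.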