There's an ex-Intel person who sometimes posts on AT:
It seems pretty clear from the numbers of failures from the people running these 24/7. Essentially, the increased operating time is giving us insight into the future.
Page 54 - Intel processors crashing Unreal engine games (and others)
That RPL was rushed is now in little doubt, but if the last scenario is anywhere close to what happened?
No, this is obviously voltage-induced stress that Intel was supposed to catch by running accelerated aging simulations in the racks. Here is how that works: the design team (not the fab folks) designs an aging simulation by running a power virus at elevated temps (~120C) and elevated voltages. This protocol should have exposed any issues related to the grotesque over-voltage that Intel has deployed in recent years in an attempt to stay competitive in benchmarks.
That leaves some possible scenarios:
- Raptor Lake was rushed out so quickly they didn’t even have enough time to run the whole aging simulation: seems unlikely, since it only takes 6-8 weeks.
- The aging simulation is broken: how? The content should not have changed… the corners should be well above what is productized.
- They didn’t even bother running the aging simulation: that would be beyond irresponsible.
- They ran the simulation and found issues, but either just ignored them due to time-to-market or competitive pressure, or management told the engineering team to swag an operating point where the issues wouldn’t crop up. The problem with that is the aging simulation is a statistical exercise: they likely wouldn’t have enough parts and rack time to re-verify at the actual swagged voltage. The entire point of the aging simulation is to go well beyond the actual operating points and tease out the errors without having to run millions of parts for months on end. Management must have known this and brought the part to market anyways.
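For anyone wondering how 6-8 weeks on the racks can stand in for years in the field, here is a rough sketch of the usual textbook reasoning: an Arrhenius term for temperature multiplied by an exponential term for voltage. This is not Intel's actual protocol. The activation energy, voltage coefficient, use temperature and both voltages below are made-up illustrative values; only the ~120C stress temperature and the 8-week duration come from the quote.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def temperature_acceleration(ea_ev, t_use_c, t_stress_c):
    """Arrhenius acceleration factor between use and stress temperature (Celsius in)."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k))


def voltage_acceleration(gamma_per_v, v_use, v_stress):
    """Exponential voltage acceleration model: exp(gamma * (V_stress - V_use))."""
    return math.exp(gamma_per_v * (v_stress - v_use))


def equivalent_field_years(stress_weeks, af_total):
    """Field time (years) that a stress run of stress_weeks is meant to represent."""
    return stress_weeks * af_total / 52.0


if __name__ == "__main__":
    # All values below are hypothetical except the 120C stress temperature
    # and the 8-week run length, which are taken from the quoted post.
    af_t = temperature_acceleration(ea_ev=0.7, t_use_c=80.0, t_stress_c=120.0)
    af_v = voltage_acceleration(gamma_per_v=5.0, v_use=1.3, v_stress=1.5)
    years = equivalent_field_years(stress_weeks=8, af_total=af_t * af_v)
    print(f"temperature AF: {af_t:.1f}, voltage AF: {af_v:.1f}, "
          f"equivalent field time: {years:.1f} years")
```

The catch the quote points at falls out of the same math: if the shipped voltage is swagged closer to the stress corner, the voltage factor shrinks toward 1, and the same rack time covers far less field life.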
Well, that's like the "Intel fired their server validation team"* story: hard to believe.
Apparently Pat G will fix all this. However, wasn't RPL on his watch? And for that matter, wasn't he there in the Athlon days when Intel did all that antitrust OEM stuff, although maybe in a role without any say on that?
* The truth seems to have been that they attempted to move them all to one site and lost tons of personnel with years of experience.