Back in 1996 for my final degree project (systems architecture), I wrote a superscalar pipeline simulator that could take an 8086 program (assembler) and execute it across 2 simulated 8086 CPUs (MIMD - Multiple Instruction, Multiple Data). It was in effect a dual-core 8086, and all clock cycles were based on real 8086 timings.
It could execute a single 8086 assembler program (that you could write yourself in a text editor), and using 'out of order execution' it would look for sections of code (in the prefetch buffer) that could be done out of order. When it found these sections it would execute them in parallel on the other core, thus making a single thread execute over 2 cores. When a section had executed it would store the results; when the main stream later reached that section (already done on the other core), it would look up the value(s) calculated and return them immediately from cache. The main thread of the program was none the wiser, as the final program output was the same.
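The core of that trick is deciding whether a block of instructions is independent of the main stream. A minimal sketch of that check (in Python, not the original VB4, and with an illustrative instruction format I've made up for this example - `(dest_reg, src_regs, op)` tuples):

```python
def is_independent(main_block, candidate_block):
    """A candidate block can run out of order on the second core if it
    neither reads registers the main block writes, nor writes registers
    the main block reads or writes (no data hazards either way)."""
    main_writes = {dest for dest, _, _ in main_block}
    main_reads = {r for _, srcs, _ in main_block for r in srcs}
    cand_writes = {dest for dest, _, _ in candidate_block}
    cand_reads = {r for _, srcs, _ in candidate_block for r in srcs}
    return (not (cand_reads & main_writes)
            and not (cand_writes & (main_reads | main_writes)))

# An AX/BX block vs. an unrelated CX/DX block: independent, so core 2
# could run the second block while core 1 runs the first.
main = [("AX", ["AX", "BX"], "ADD"), ("BX", ["AX"], "MOV")]
other = [("CX", ["DX"], "MOV"), ("DX", ["CX", "DX"], "ADD")]
print(is_independent(main, other))   # -> True
print(is_independent(main, main))    # -> False
```

A block that passes this check can be handed to the other core and its results cached; a block that fails has to wait its turn in program order.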
As well as parallelization across the 2 cores, parallelization also happened within the pipelines (IFETCH, IDECODE, IEXECUTE, IINTERRUPT), so an instruction could be in the IDECODE part of the pipeline while the next instruction was in the IFETCH section, etc. Depending on the assembler program the pipeline would be pretty full; however, if you had a program that would not scale well then the pipeline would not be filled up. There were also situations where the second core was not used, again depending on whether a section of code could be done out of order.
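The cycle savings from that overlap are easy to see in a toy model (again an illustrative Python sketch, not the original simulator, and using a simplified 3-stage version of the pipeline): with no stalls, n instructions take n + stages - 1 cycles instead of n * stages.

```python
STAGES = ["IFETCH", "IDECODE", "IEXECUTE"]

def cycles(n_instructions, pipelined=True):
    """Ideal cycle count for n instructions, with and without overlap."""
    if pipelined:
        return n_instructions + len(STAGES) - 1  # stages overlap
    return n_instructions * len(STAGES)          # one instruction at a time

def timeline(n):
    """Print which instruction occupies each stage on each clock cycle."""
    for cycle in range(n + len(STAGES) - 1):
        row = []
        for s, stage in enumerate(STAGES):
            i = cycle - s  # instruction i enters stage s on cycle i + s
            row.append(f"{stage}=I{i}" if 0 <= i < n else f"{stage}=--")
        print(f"cycle {cycle}: " + "  ".join(row))

print(cycles(10, pipelined=False))  # -> 30
print(cycles(10, pipelined=True))   # -> 12
timeline(4)
```

A real simulation has stalls and flushes on hazards and branches, which is exactly why a program that "would not scale well" leaves the pipeline partly empty.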
To test everything, you could turn features on/off to the point it was a basic 8086, and see variable and register results in separate windows. When the program ran it would count the clock cycles being used. You knew it was working when the clock cycles changed for different settings but the results were the same.
All the above was written in Visual Basic 4, and I got an A for the project. I work in software, but this was the closest I ever got to chips, lol.
I remember talking to the Systems Architecture lecturer about how the limits of MHz speeds were being reached, and how in the future computers would have to go multi-CPU. We also used to talk about the law of diminishing returns, and how the gains shrink the more you try to execute in parallel. This was 1996, remember.
When I was doing research for the project I looked into Transputers. Also, out-of-order execution is not new; it first appeared on IBM mainframes in the '60s.