What exactly is the architecture that the Cell Simulator models by default?
In particular, I want to know how the MMU is configured, in order to resolve one simple problem.
I have written a simple program in which the SPU performs a DMA (get or put).
I measure the time of just the DMA using the prof_clear(), prof_start() and prof_stop() commands.
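A minimal sketch of this kind of measurement is shown here, assuming the SPU-side MFC calls from spu_mfcio.h and that the simulator's profiling macros come from a header named profile.h (the header name and include path are assumptions). The names timed_dma_get, local_buf and DMA_SIZE are hypothetical, and the #else branch contains stand-ins only so that the sketch also compiles off-target; it is not the original program.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#ifdef __SPU__
#include <spu_mfcio.h>  /* mfc_get, mfc_write_tag_mask, mfc_read_tag_status_all */
#include <profile.h>    /* assumed header for prof_clear, prof_start, prof_stop */
#else
/* Host-side stand-ins (assumptions): profiling becomes a no-op and the
   DMA get becomes a plain memcpy, so the sketch builds without the SDK. */
#define prof_clear() ((void)0)
#define prof_start() ((void)0)
#define prof_stop()  ((void)0)
#define mfc_get(ls, ea, size, tag, tid, rid) \
    memcpy((void *)(ls), (const void *)(uintptr_t)(ea), (size))
#define mfc_write_tag_mask(mask)  ((void)(mask))
#define mfc_read_tag_status_all() ((void)0)
#endif

#define DMA_SIZE 4096u  /* hypothetical transfer size in bytes */

/* Local-store destination buffer, aligned to 128 bytes. */
static uint8_t local_buf[DMA_SIZE] __attribute__((aligned(128)));

/* Profiles a single DMA get of `size` bytes from effective address `ea`
   into local_buf, and returns the number of bytes requested. */
unsigned timed_dma_get(uint64_t ea, unsigned size)
{
    const unsigned tag = 0;

    /* Single DMA transfers of this kind need 16-byte size and alignment. */
    assert(size <= DMA_SIZE && size % 16 == 0);
    assert(ea % 16 == 0);

    prof_clear();                      /* reset the profiling counters      */
    prof_start();                      /* begin the measured region         */
    mfc_get(local_buf, ea, size, tag, 0, 0);
    mfc_write_tag_mask(1u << tag);     /* select the tag group to wait on   */
    mfc_read_tag_status_all();         /* block until the transfer is done  */
    prof_stop();                       /* end the measured region           */

    return size;
}
```

Waiting on the tag status inside the measured region means the recorded cycles cover the whole transfer, including any address translation work, rather than only the cost of queuing the command.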
When the SPU's DMA is the first access to the data-section array, the time recorded in the statistics is about 26k cycles.
I assume this is due to a TLB miss.
In the second run, I first initialize the array on the PPU, and then the SPU performs the DMA. The time recorded is around 500 cycles.
So I think the TLB hit in this case, which would explain the shorter time.
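The PPU-side setup for the second run can be sketched as follows, assuming the array is a statically allocated, 128-byte-aligned buffer. The names dma_array, ARRAY_BYTES and init_array_on_ppu are hypothetical and only illustrate the step of writing the array before the SPU starts its DMA.

```c
#include <stdint.h>
#include <string.h>

#define ARRAY_BYTES 4096u  /* hypothetical array size in bytes */

/* Stand-in for the data-section array that the SPU later transfers. */
static uint8_t dma_array[ARRAY_BYTES] __attribute__((aligned(128)));

/* Writes every byte of the array from the PPU side, so all of its pages
   have been touched before the SPU runs. Returns the effective address
   that would be passed to the SPU as the DMA source or destination. */
uint64_t init_array_on_ppu(uint8_t fill)
{
    memset(dma_array, fill, sizeof dma_array);
    return (uint64_t)(uintptr_t)dma_array;
}
```

In the first run this initialization step is simply absent, so the SPU's DMA is the first access to the array.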
However, the architecture document says that there is one MMU per SPU, so initializing the array on the PPU should not affect the SPU's TLB. So this behaviour should not happen?
Can anyone please tell me where I can find a description of the architecture that the Cell Simulator models?