next up previous
Next: Analytical Bounding of Execution Up: Applications Previous: Applications

Fast Instruction Cache Analysis

Instruction cache analysis can be performed on-the-fly in an efficient manner simulating much of the caching behavior statically. The simulation is performed for the function-instance graph of the program and a small UPPA for each function. Thus, a small set of measurement points can be determined. The program can be instrumented with measurement code on a unique edge/vertex within all UPs. The measurement code consists of frequency counters for each UP and state transitions similar to a deterministic finite automaton to simulate ``conflict'' instructions whose caching behavior cannot be determined prior to execution time. This keeps the measurement overhead fairly low.

During program execution, counters record the frequency of each instruction. At program termination, the total number of cache hits and misses is determined by relating the frequency with the instructions of a UP and their categorization. First misses are initially counted as hits. Then, the set of first miss lines is correlated with the corresponding frequency counters and if any counter is greater than zero, the total hit count is reduced by one while the miss count is incremented by one. For group first misses (see discussion of Figure 4), only one first miss is counted as a miss.

The implementation of the described method showed that the execution time of the instrumented program takes about twice the time of the uninstrumented program []. Other methods to perform on-the-fly instruction cache analysis result in an overhead of 6 to 16 at best [].



Robert Palmer
Mon May 19 10:38:07 EDT 1997