Next: Future Work Up: On Debugging Real-Time Applications Previous: Applications

Measurements

The environment discussed above was implemented for the SPARC architecture. It includes a modified compiler back-end of VPO (very portable optimizer) [], the static simulator for direct-mapped caches [], and the regular system linker and source-level debugger DBX under SunOS 4.1.3. Calling a library routine to query the elapsed time takes a negligible amount of time, on the order of one millisecond. This section therefore focuses on measuring the overhead of cache simulation during program execution. The correctness of the instruction cache simulation was verified by comparison with a traditional trace-driven cache simulator. To determine the run-time overhead of cache simulation, the execution time was measured for a number of user programs, benchmarks, and UNIX utilities using the built-in timer of the operating system. Table 1 lists programs of varying size (column 3), the overhead of unoptimized code (column 4), and the overhead of supporting virtual timing information through dynamic cache simulation (columns 5-8) for cache sizes of 1kB, 2kB, 4kB, and 8kB. All overheads are given as a factor of the execution time of optimized code.
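For readers unfamiliar with the reference technique, the following is a minimal sketch of a trace-driven simulator for a direct-mapped instruction cache. It is an illustration only, not the simulator used in the measurements: the function name `simulate_direct_mapped`, the 16-byte line size, and the synthetic trace `loop_trace` are all assumptions made for this example.

```python
def simulate_direct_mapped(trace, cache_size, line_size=16):
    """Count (hits, misses) for a direct-mapped cache over an address trace.

    line_size=16 is an assumed value, not taken from the paper.
    """
    num_lines = cache_size // line_size
    tags = [None] * num_lines          # one stored tag per cache line, empty at start
    hits = misses = 0
    for addr in trace:
        block = addr // line_size      # memory block holding this address
        index = block % num_lines      # cache line the block maps to
        tag = block // num_lines       # distinguishes blocks that share an index
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag          # replace whatever was cached there
    return hits, misses


# Hypothetical trace: a 2 kB straight-line loop body fetched 10 times,
# one 4-byte instruction at a time.
loop_trace = [addr for _ in range(10) for addr in range(0, 2048, 4)]

for size in (1024, 2048, 4096, 8192):
    print(size, simulate_direct_mapped(loop_trace, size))
```

Under these assumptions the 1 kB cache misses 1280 times, because the two halves of the loop keep evicting each other, while caches of 2 kB and larger miss only 128 times, once per program line.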

On average, unoptimized programs ran 1.8 times slower than their optimized versions. Running the optimized program with cache simulation to provide virtual timing information was on average 2.1 to 7.8 times slower than executing optimized code alone. In other words, the optimized code with cache simulation was roughly 1 to 4 times slower (2.1/1.8 ≈ 1.2 to 7.8/1.8 ≈ 4.3) than the unoptimized code typically used for program debugging.

The cache size influences the overhead factor considerably, which can be explained as follows. For small cache sizes, programs do not fit into cache and capacity misses occur frequently. Each such miss incurs the dynamic overhead of simulating program lines classified as ``conflicts''. For larger cache sizes, a larger portion of the program fits into cache, which reduces capacity misses and thereby the number of ``conflicts''. Once the entire program fits into cache, no ``conflicts'' need to be simulated; frequency counters are then sufficient to simulate the cache behavior. This reduces the overhead considerably.
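The following sketch illustrates why counters can suffice once the program fits into a direct-mapped cache: no two program lines compete for the same cache line, so each line misses exactly once and every later reference hits. This is a simplified illustration under stated assumptions, not the paper's implementation; the function name `counter_based_result`, the 16-byte line size, the contiguous program layout, and the synthetic trace are all hypothetical.

```python
from collections import Counter

LINE_SIZE = 16  # bytes per cache line (assumed, not from the paper)


def counter_based_result(trace, cache_size):
    """Derive (hits, misses) from per-line frequency counters alone.

    Returns None when the program does not fit in the cache, because
    conflicts are then possible and the counters are not sufficient.
    Assumes the program occupies one contiguous address range.
    """
    counts = Counter(addr // LINE_SIZE for addr in trace)
    if not counts:
        return 0, 0
    program_bytes = (max(counts) - min(counts) + 1) * LINE_SIZE
    if program_bytes > cache_size:
        return None                    # per-reference simulation would be needed
    misses = len(counts)               # each program line misses once, on first use
    hits = sum(counts.values()) - misses
    return hits, misses


# Hypothetical trace: a 2 kB straight-line loop body fetched 10 times,
# one 4-byte instruction at a time.
loop_trace = [addr for _ in range(10) for addr in range(0, 2048, 4)]

for size in (1024, 2048, 4096, 8192):
    print(size, counter_based_result(loop_trace, size))
```

For the 1 kB cache the function declines to answer, which corresponds to the case where ``conflicts'' must be simulated at run time. For 2 kB and larger it needs only one counter update per reference and no tag comparison, which is the source of the lower overhead.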



Robert Palmer
Mon May 19 10:16:17 EDT 1997