After decomposing the program into function instances and UPs, there still remain lines that are analyzed to be in conflict with another line. It is inevitable to maintain information at run time to determine which program line is currently cached and to update this information dynamically. This is achieved by maintaining a path state at execution time. A path state only reflects the conflicts local to the current path (in contrast to a cache state that comprises the global state of a cache memory). Consider the example in Figure 4, which will be discussed step-by-step in the following. Path 1 contains program line a which conflicts with program line x since both map into cache line c and line x can be reached from path 1. Thus, the shared path state (SPS) for path 1 keeps track of whether or not program line a is in cache. More detailed examples are given in [].
Naively, a path state may be kept on the most specialized level (for each function instance and path). But this would require a considerable amount of interaction between UPs. In the worst case, the execution of a UP of some function instance would not only have to update its path state but every other path state conflicting with a line of this path and any function instance.
![]()
Figure 4: Frequency Counters Indexed by the SPS
Cache state information is therefore merged after simulation in two stages to comprise path states. First, the conflicts of the cache states of a UP of all instances of a function are merged into one local path state (LPS). Second, local path states of neighboring UPs that share at least one instruction are merged into one shared path state (SPS). The latter is illustrated in Figure 4. The LPS of path 1 contains the conflicting program line a while the LPS of path 2 contains both conflicting program lines a and b. The two paths overlap in block 1 and 4. Thus, the SPS for paths 1 and 2 contains the program lines a and b. LPSs allow uniform instrumentation of code rather than distinguishing instances dynamically at every instrumentation point or replicating code for each instance. Both merging operations greatly reduce the overhead of dynamic simulation for conflicts. While a SPS only needs to maintain one state to keep track of conflicts dynamically, the state may comprise a wider range of values to combine all possible conflicts of overlapping paths.
Generated code is instrumented by inserting instructions at the unique transition of each UP to keep track of the SPSs and record the frequency of executed instructions for this path and state. At the exit points of the program, an epilogue is inserted to call a library routine, which calculates the total hits and misses from the gathered state-dependent frequencies.
The code emitted by the compiler back end includes macro calls for each UP and for each call site. The simulator generates the corresponding macros bodies, produces tables to store SPSs and frequency counters at run time, and provides other constant data structures for the final calculation of hits and misses.