In the past few years, research in the area of predicting the WCET of programs has intensified. Conventional methods for static analysis have been extended from unoptimized programs on simple CISC processors to optimized programs on pipelined RISC processors [12, 6], and from uncached architectures to instruction caches [2, 10, 8] and data caches [9, 11].
Kim et al. [9] have recently published work about bounding data cache performance for calculated references, which they refer to as occurring from dynamic load and store instructions. Their approach uses a version of the pigeonhole principle. For each loop they determine the maximum number of references from each dynamic load/store instruction. They also determine the maximum number of distinct locations in memory referenced by these instructions. The difference between these two values is the number of data cache hits for the loop given that there are no conflicting references. This technique efficiently detects temporal locality within loops when all of the data references within a loop fit into cache and the size of each data reference is the same size as a cache line. Their technique at this time does not detect any spatial locality (i.e., when the line size is greater than the size of each data reference and the elements are accessed contiguously) and detects no temporal locality across different loop nests. Furthermore, they cannot currently deal with compiler optimizations that alter the correspondence of assembly instructions to source code. Such compiler optimizations can make calculating ranges of relative addresses significantly more challenging.
Li et al. [11] have described a framework to integrate data caching into their integer linear programming (ILP) approach to timing prediction. Their implementation performs data-flow analysis to find conflicting blocks. However, their linear constraints describing the range of addresses of each data reference currently have to be calculated by hand. They also require a separate constraint for every element of a calculated reference causing scalability problems for large arrays. No WCET results on data caches are reported. The ILP approach facilitates integrating additional user-provided constraints into the analysis.
The possibility of an extension of Park's timing schema for set-associative caches is briefly mentioned in [12] However, it has not been formalized nor has an implementation been described or any results been reported. The integer linear programming (ILP) approach [11] has been extended to support timing predictions for set-associative instruction caches. An automaton describes transitions between cache states for each set of conflicting blocks and additional cache constraints describe the problem on the ILP level. The results only estimate the number of cache misses, which makes a comparison with our results difficult since we predicted both cache hits and misses. Their results indicate a much higher overhead of the ILP approach (up to hours) compared to our methods (up to seconds). It is not clear how their approach scales in general with changing associativity.