Abstract:
Alias analysis is a technique for determining whether two pointers point to overlapping memory locations. Alias information, especially proof that pointers do not overlap, is critical for enabling key transformations such as vectorization, redundant code elimination, and loop-invariant code motion. Context-sensitive interprocedural alias analyses are generally more precise than intraprocedural analyses, but they often do not scale to large, real-world programs. Context-insensitive interprocedural analyses scale better, but they sacrifice significant precision because of call-site merging and conservative assumptions. As a consequence, most production compilers trade precision for scalability and implement intraprocedural alias analysis. Moreover, purely static alias analyses are often too conservative when aliasing depends on runtime inputs, calling contexts, or control-flow conditions, especially when pointers overlap only in specific or rare executions. We first proposed a tool to estimate an upper bound on the performance of SPEC benchmarks given the most precise aliasing information. The key idea was to profile the SPEC benchmarks to log alias information, use this information to optimize the program, and thereby obtain an upper bound on the performance improvement that the most precise alias analysis could deliver. Here, the upper bound is an empirical, input-dependent estimate of the performance improvement achievable through improved aliasing precision. We found that an execution time improvement of up to 11.56% is possible for the SPEC benchmarks and up to 53.32% for the Polybench benchmarks, which are used to evaluate the effectiveness of loop transformations.
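As a small illustration of why non-overlap information matters, consider the following C sketch. The kernel and its names (scale_may_alias, scale_no_alias) are hypothetical and are not taken from the benchmarks; the restrict qualifier stands in for the guarantee that a precise alias analysis would otherwise have to prove.

```c
#include <stddef.h>

/* Without alias information the compiler must assume that dst may overlap
 * src or scale. It therefore has to reload *scale in every iteration and
 * cannot hoist the load or vectorize the loop without further checks. */
void scale_may_alias(float *dst, const float *src, const float *scale, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * *scale;
}

/* With C99 restrict the programmer promises that the three pointers do not
 * overlap, so the load of *scale is loop-invariant and the loop can be
 * vectorized. Calling this version with overlapping pointers is undefined
 * behavior. */
void scale_no_alias(float *restrict dst, const float *restrict src,
                    const float *restrict scale, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * *scale;
}
```

When scale points into dst, the first version observes the updated value in later iterations, which is exactly the behavior a compiler has to preserve when it cannot rule out aliasing.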
To address this imprecision, several prior works combine code versioning with dynamic checks, which preserves the scalability of the alias analysis while selectively recovering precision in performance-critical regions. These approaches do not increase the precision of the alias analysis itself; instead, they use runtime checks to make the available alias information more effective at enabling optimizations. These techniques are particularly effective when applied at loop granularity, since loops often dominate runtime. However, most of these approaches either are restricted to loops with loop-invariant bounds or incur very high dynamic-check overheads, which limits their applicability. This thesis proposes two approaches that reduce the overhead of dynamic checks to constant time. The first approach constrains the allocation size and alignment of memory objects using a segment-based allocator. With it, we achieved a CPU performance improvement of up to 1.47%, with a geometric mean of 0.55%, for the SPEC benchmarks. The allocator introduced a maximum overhead of 8.36%, with a geometric mean of 1.47%, across all SPEC benchmarks. For the Polybench benchmarks, we achieved a CPU performance improvement of up to 51.11%. The allocator introduced a maximum overhead of 5.3%, with a geometric mean of 0.21%, across all Polybench benchmarks. Our second approach reduces allocator overhead by selectively constraining memory allocations. We also proposed a region-based allocation strategy that eliminates the need for memory accesses in the dynamic checks. This approach never resulted in performance degradation and achieved improvements of up to 1.88%, with a geometric mean of 0.58%, on the SPEC benchmarks. The maximum CPU overhead of our allocator is 0.57%, with a geometric mean of -0.2%, for the SPEC benchmarks. Both of our approaches outperform a previous pointer-disambiguation approach, which reported a 29% overhead for a SPEC CPU 2006 benchmark.
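The combination of code versioning and a constant-time dynamic check described above can be sketched in C as follows. This is a minimal illustration under a stated assumption, not the allocator or the check developed in this thesis: it assumes every object lies entirely within one aligned segment of 2^SEGMENT_SHIFT bytes, so two pointers with different segment numbers cannot refer to overlapping objects. SEGMENT_SHIFT and all function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: each object lives entirely inside one aligned
 * segment of 2^SEGMENT_SHIFT bytes, so all accesses made through a pointer
 * stay inside that pointer's segment. */
#define SEGMENT_SHIFT 20u

/* Constant-time check: one shift per pointer and one comparison, independent
 * of the loop bounds and without loading any allocator metadata. */
int in_different_segments(const void *p, const void *q) {
    return ((uintptr_t)p >> SEGMENT_SHIFT) != ((uintptr_t)q >> SEGMENT_SHIFT);
}

/* Optimized version: restrict lets the compiler vectorize freely. */
void add_fast(float *restrict dst, const float *restrict src, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}

/* Conservative version: correct even if dst and src overlap. */
void add_safe(float *dst, const float *src, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}

/* Code versioning: choose the loop version at run time.
 * Returns 1 if the optimized version ran, 0 otherwise. */
int add_versioned(float *dst, const float *src, size_t n) {
    if (in_different_segments(dst, src)) {
        add_fast(dst, src, n);
        return 1;
    }
    add_safe(dst, src, n);
    return 0;
}
```

The check is conservative: two distinct objects that happen to share a segment fall back to the safe version, which costs performance but never correctness.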
The primary contribution of this thesis is a dynamic disambiguation approach that can be applied to improve the performance of real-world, large-scale applications.