Optimization in Compilers: Challenges, Interdependencies, and Experimental Insights
By Krisela Skenderi & Thomas Flaig
Optimization in Compilers: Challenges, Interdependencies, and Experimental Insights
By Krisela Skenderi & Thomas Flaig
Over half of a typical compiler's source code is dedicated to optimization-related functions. These optimizations, designed to improve code efficiency, are challenging to verify against a specification. Experiments indicate that advanced compiler optimizations are not as reliable as commonly assumed, and software “bugs” due to optimizing transformations have been found in many existing programs, impacting security and performance.
The most common problem related to optimizations is undefined behavior exploitation: Some compilers or new versions of old compilers assume that undefined behavior does not exist in conforming C programs, and apply optimizations that cause the compiled program to not work as expected in higher optimization levels such as -O2.
Therefore, it is crucial to ensure that compiler optimizations are tested thoroughly enough by the test suite when performing compiler qualification. Unfortunately, commercial test suites simply verify conformity with the ISO C standard, without providing information about the specific compiler optimizations that have been tested.
One approach to ensure that the behavior of production code is not influenced by optimizations is to run the test suite and decide afterwards which optimizations should be disabled in the production run. This is done by checking which optimizations were not triggered by the test suite, and therefore are considered as not tested. The untested optimizations are disabled by changing the compilation use case of the program (the set of compiler and linker flags) to either remove the flag related to the optimization, or to add its negative form.
Compiler optimizations are implemented using a sequence of sophisticated transformations that turn a program into a semantically-equivalent one. Certain optimization transformations make other types of optimizations applicable to the transformed code. The theoretical dependency between atomic optimizations and groups of optimizations is extensively explored in literature, an example of which is the relationship between Strength Reduction and Dead Code Elimination. In some cases, strength reduction can introduce dead code, which is then eliminated by subsequent dead code elimination optimizations.
The results of the experiment described in the article “Importance of Use Case during Compilation”, we observe a case in which disabling one atomic optimization results in a similar optimization being performed 69.5% times more than in the reference -O2 case.
A specific test case that triggers multiple optimizations is selected from the test suite. The optimizations triggered by it are illustrated in the table below. The -fno-early-inlining flag is used to disable the early inlining of small functions optimization.
The machine and assembly code is generated for this test case, for both cases when the Early Inlining of Small Functions optimization is enabled and disabled. This is done with the command objdump –disassemble testcase.o for the two object files obtained after the compilation process. The files are then compared. This comparison allows us to observe the similarity in the generated code, in which the majority of differences are in register names, while the machine instructions remain the same.
The results of this experiment for one specific test case show that manually disabling an atomic compiler optimization (through the negative form of its flag) does not actually prevent the optimizing transformation from taking place.
An analysis of the optimizations triggered by the test suite is required after disabling an option, to ensure that the intended optimization is indeed disabled as expected. Relying solely on the common approach described in the Introduction is not feasible.
X. Wang, N. Zeldovich, M. F. Kaashoek, and A. Solar-Lezama. Towards Optimization-Safe Systems: Analyzing the Impact of Undefined Behavior. In: Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles. SOSP ‘13. Farminton, Pennsylvania: Association for Computing Machinery, 2013, pp. 260–275. doi: 10.1145/2517349.2522728.
International Organization for Standardization. ISO/IEC 9899:2018(en) – Information technology – Programming languages – C. International Organization for Standardization. ISO/IEC 9899:2018(en). 2018.
S. S. Muchnick. Advanced Compiler Design and Implementation. San Francisco, CA: Morgan Kaufmann, 1997.
Free Software Foundation. GCC Documentation. 13.1.0. Free Software Foundation. https://gcc.gnu.org/onlinedocs/, 2023.