Programming Languages

Analyzing the Discipline of Preprocessor Annotations in 30 Million Lines of C Code

by Jörg Liebig, Christian Kästner, and Sven Apel

In Proceedings of the 10th ACM International Conference on Aspect-Oriented Software Development (AOSD), pages 191–202. ACM Press, 2011.

Abstract

The C preprocessor cpp is a widely used tool for implementing variable software. It enables programmers to express variable code of features that may crosscut the entire implementation with conditional compilation. The C preprocessor relies on simple text processing and is independent of the host language (C, C++, Java, and so on). Language-independent text processing is powerful and expressive (programmers can make all kinds of annotations in the form of #ifdefs), but it can render unpreprocessed code difficult to process automatically by tools for tasks such as aspect refactoring, concern management, static analysis, and variability-aware type checking. We distinguish between disciplined annotations, which align with the underlying source-code structure, and undisciplined annotations, which do not align with the structure and hence complicate tool development. This distinction raises the question of how frequently programmers use undisciplined annotations and whether it is feasible to change them to disciplined annotations to simplify tool development and to enable programmers to use a wide variety of tools in the first place. By means of an analysis of 40 medium-sized to large-sized C programs, we show empirically that programmers use cpp mostly in a disciplined way: about 85% of all annotations respect the underlying source-code structure. Furthermore, we analyze the remaining undisciplined annotations, identify patterns, and discuss how to transform them into a disciplined form.
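
To make the distinction in the abstract concrete, the following is a minimal sketch in C. It is not taken from the paper or from any of the analyzed programs. It assumes that "aligning with the source-code structure" means that an #ifdef wraps complete statements or declarations, and that an annotation wrapping only part of an expression counts as undisciplined. The macro FEATURE_BOUNDS_CHECK and both function names are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* FEATURE_BOUNDS_CHECK is a hypothetical feature macro (an assumption
   for this sketch). Compile with -DFEATURE_BOUNDS_CHECK to enable it. */

/* Undisciplined (under the assumption stated above): the #ifdef wraps
   only one operand of the condition, so the annotated text is not a
   complete statement or expression on its own. */
int get_undisciplined(const int *buf, size_t len, size_t i) {
    (void)len; /* avoids an unused-parameter warning when the feature is off */
    if (buf != NULL
#ifdef FEATURE_BOUNDS_CHECK
        && i < len
#endif
    ) {
        return buf[i];
    }
    return -1;
}

/* Disciplined rewrite: the optional check becomes a whole if statement,
   so the #ifdef boundaries coincide with statement boundaries. */
int get_disciplined(const int *buf, size_t len, size_t i) {
    (void)len;
    if (buf == NULL) {
        return -1;
    }
#ifdef FEATURE_BOUNDS_CHECK
    if (i >= len) {
        return -1;
    }
#endif
    return buf[i];
}
```

Both functions are intended to behave identically whether or not the macro is defined. The difference lies only in where the annotation boundaries fall: in the second version a tool could treat the optional code as one complete statement, without having to reason about a partial expression.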