Bug hunting with static analysis tools


The best way to spend less time fixing bugs in software is not creating bugs in the first place, and a static analysis tool is perfect for this job.

The saying goes that in software development we spend 50% of our time debugging, and the other 50% bugging! A joke, but with an element of truth.

We are human and we make mistakes. Inevitably, we will end up adding some bugs to the code. Therefore, we need tools that minimize the likelihood of these errors happening. This is why we need static analysis tools.

Static analysis tools analyze the source code, without running the program, to find problems before they happen. In C/C++ programs, these tools can find errors like null pointer dereferences, memory leaks, division by zero, integer overflow, out of bounds access, use before initialization and more, all just by inspecting the source code!

But before talking about these tools, did you know that your compiler also does static code analysis?

Static analysis in a compiler

The objective of a compiler is to generate the executable/binary file, so compilers are not focused on static code analysis by default.

But over time they have been getting better at static code analysis, and the result is more warnings when we compile our code.

We could say that the warnings generated by the compiler are very annoying, and that most messages are not critical or will not cause any problems in the application. I would argue that we should trust the compiler, because one thing is for sure: no one understands the language, its syntax and its semantics better than the compiler itself.

So we must take some time to check and fix the warnings. It is much less costly than wasting several hours to find a problem that a tool can automatically find for you.

For example, take a look at the code below. Do you think it’s going to print “ON” or “OFF”?

#include <stdio.h>

#define ON  0xFF
#define OFF 0x00

void print_message(char status)
{
    if (status == ON)
        printf("ON\n");
    else
        printf("OFF\n");
}

int main(int argc, const char *argv[])
{
    print_message(ON);

    return 0;
}

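Well, it looks like it will print “ON”, right? Wrong! On a platform where char is signed (as it is here), it will print “OFF”! The value 0xFF (255) does not fit in a signed char, so the argument typically becomes -1 when it is passed to the function, and -1 never compares equal to 255.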

$ gcc main.c -o main
$ ./main
OFF

If the source code is compiled with Clang, we will find the bug right away!

$ clang main.c -o main
main.c:8:16: warning: comparison of constant 255 with expression of type 'char' is always
      false [-Wtautological-constant-out-of-range-compare]
    if (status == ON)
        ~~~~~~ ^  ~~
1 warning generated.

Clang has a very good static analyzer. GCC is not as good here, but it can also catch the bug when compiling with -Wall and -Wpedantic (although the warning message is not as friendly).

$ gcc main.c -o main -Wall -Wpedantic
main.c: In function ‘main’:
main.c:3:13: warning: overflow in implicit constant conversion [-Woverflow]
 #define ON  0xFF
             ^
main.c:16:19: note: in expansion of macro ‘ON’
     print_message(ON);
                   ^
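
One possible way to resolve these warnings, sketched here under the assumption that status is meant to carry a raw byte value, is to give the parameter an unsigned 8-bit type. The status_text helper is a hypothetical addition (it is not part of the original program) that makes the comparison easy to check on its own:

```c
#include <stdint.h>
#include <stdio.h>

#define ON  0xFF
#define OFF 0x00

/* Hypothetical helper: uint8_t holds 0..255, so 0xFF reaches the
 * comparison unchanged instead of being converted to a negative value. */
const char *status_text(uint8_t status)
{
    return (status == ON) ? "ON" : "OFF";
}

/* Same role as the original print_message, with an unsigned parameter. */
void print_message(uint8_t status)
{
    printf("%s\n", status_text(status));
}
```

With this change, calling print_message(ON) takes the “ON” branch regardless of whether plain char is signed on the target platform.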

So we should pay attention to all compiler warnings. Going further, we need tools that generate more warnings, not less. Tools that can analyze the source code and find strange constructions, potential problems and unusual uses of the language.

As you can see, compilers like Clang and GCC are reasonably good at static analysis, but that is not their main objective. The C and C++ standards require a diagnostic only for violations of syntax rules and constraints, and warnings like the ones above go beyond that requirement. Compilers emit them anyway because they are helpful for developers and relatively cheap to implement: most are a byproduct of simple checks already done during the compilation process.

Also, static code analysis takes a significant amount of time and is usually not needed for every single compilation. Compilation time is very important for a compiler. And there is a trade-off between checking code quality and compilation time.

That is why we have specific tools for static code analysis. These tools are also called linters, a term that comes from lint, a Unix utility that examined C language source code.

A list of static code analysis tools is available on Wikipedia. And one of the best open source static analysis tools is cppcheck.

What is cppcheck?

Cppcheck is a static analysis tool for C/C++ code. It provides unique code analysis to detect bugs and focuses on detecting undefined behavior and dangerous coding constructs, including:

  • Dead pointers
  • Division by zero
  • Integer overflows
  • Invalid bit shift operands
  • Invalid conversions
  • Invalid usage of STL
  • Memory management
  • Null pointer dereferences
  • Out of bounds checking
  • Uninitialized variables
  • Writing const data
  • Unused or duplicated code
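
To make a few of these categories concrete, here is a minimal sketch of a function written defensively against three of them: null pointer dereference, division by zero and out of bounds access. The function name mean and its signature are hypothetical, chosen only for illustration:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: stores the integer mean of values[0..count-1]
 * in *out and returns true, or returns false if the input is unusable. */
bool mean(const int *values, size_t count, int *out)
{
    long long sum = 0;
    size_t i;

    if (values == NULL || out == NULL)  /* guards the pointer dereferences */
        return false;
    if (count == 0)                     /* guards the division below */
        return false;

    for (i = 0; i < count; i++)         /* i < count keeps reads in bounds */
        sum += values[i];

    *out = (int)(sum / (long long)count);
    return true;
}
```

Removing any one of the guards gives the kind of construct that a static analysis tool is designed to flag.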

Cppcheck is an open source project, currently hosted on Sourceforge and GitHub, with support for GNU/Linux, Windows and Mac OS operating systems.

Installing cppcheck

Procedures for installing cppcheck are available on the project’s website.

You can get cppcheck from your Linux distribution package manager (although you might get an outdated version). For example, on a Debian based distribution, you could run the following command to install cppcheck:

$ sudo apt install cppcheck

If you want to use the latest cppcheck version, you may need to download the source code, compile it and install it on your machine:

$ wget https://github.com/danmar/cppcheck/archive/1.90.tar.gz
$ tar xfv 1.90.tar.gz
$ cd cppcheck-1.90/
$ make MATCHCOMPILER=yes FILESDIR=/usr/share/cppcheck HAVE_RULES=yes -j4
$ sudo make MATCHCOMPILER=yes FILESDIR=/usr/share/cppcheck HAVE_RULES=yes install
$ cppcheck --version
Cppcheck 1.90

Running cppcheck

The source code below has a function to calculate the sum of all the elements of an integer array. Can you find two bugs in this code?

void calc(void)
{
    int buf[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    int result;
    int i;

    for (i = 0; i <= 10; i++) {
        result += buf[i];
    }
}

int main(void)
{
    calc();
    return 0;
}

Well, GCC 5.4 can’t find any bug in this code!

$ gcc -Wall -Wextra -Werror -Wpedantic main.c -o main
$ ls main
main

Clang is able to find one of the bugs:

$ clang -Wall -Wextra -Werror -Wpedantic -Weverything main.c -o main
main.c:8:9: error: variable 'result' is uninitialized when used here [-Werror,-Wuninitialized]
        result += buf[i];
        ^~~~~~
main.c:4:15: note: initialize the variable 'result' to silence this warning
    int result;
              ^
               = 0
1 error generated.

And cppcheck is able to find both bugs!

$ cppcheck main.c
Checking main.c ...
main.c:8:22: error: Array 'buf[10]' accessed at index 10, which is out of bounds. [arrayIndexOutOfBounds]
        result += buf[i];
                     ^
main.c:8:9: error: Uninitialized variable: result [uninitvar]
        result += buf[i];
        ^
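
For reference, a corrected version might look like the sketch below. Returning the sum from calc is an assumption made here so that the result can be checked; the original function discards it.

```c
#include <stddef.h>

int calc(void)
{
    int buf[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    int result = 0;   /* fix 1: start the sum from a known value */
    size_t i;

    /* fix 2: derive the bound from the array itself, so the loop
     * stops at index 9 instead of reading buf[10] */
    for (i = 0; i < sizeof(buf) / sizeof(buf[0]); i++) {
        result += buf[i];
    }

    return result;
}
```

Deriving the loop bound from sizeof keeps the loop correct even if the array size changes later.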

Your compiler is good, but a static analysis tool is better at finding bugs in the code. So you should always use a static analysis tool. The more tools you use the better!

Running cppcheck on BusyBox

Why don’t we try to run cppcheck on a larger open source project like BusyBox?

$ wget https://busybox.net/downloads/busybox-1.31.1.tar.bz2
$ tar xfv busybox-1.31.1.tar.bz2
$ cd busybox-1.31.1/
$ cppcheck . 2>&1 | tee cppcheck.log
...
$ cat cppcheck.log | grep error | wc -l
146

Wow! More than 140 possible bugs in the most recent version of BusyBox (as I write this article). Some of these reports are false positives, but several deserve a closer look.

$ cat cppcheck.log | grep "Uninitialized variable"
archival/libarchive/bz/blocksort.c:1034:20: error: Uninitialized variable: origPtr [uninitvar]
archival/libarchive/bz/compress.c:235:18: error: Uninitialized variable: ll_i [uninitvar]
archival/libarchive/bz/compress.c:679:20: error: Uninitialized variable: origPtr [uninitvar]
archival/libarchive/decompress_bunzip2.c:165:20: error: Uninitialized variable: runCnt [uninitvar]
console-tools/loadfont.c:146:6: error: Uninitialized variable: height [uninitvar]
...

$ cat cppcheck.log | grep "out of bounds"
util-linux/fdisk_sgi.c:138:10: error: Array 'freelist[17]' accessed at index 17, which is out of bounds. [arrayIndexOutOfBounds]
util-linux/fdisk_sgi.c:138:10: note: Array index out of bounds
util-linux/fdisk_sgi.c:139:10: error: Array 'freelist[17]' accessed at index 17, which is out of bounds. [arrayIndexOutOfBounds]
util-linux/fdisk_sgi.c:139:10: note: Array index out of bounds
util-linux/volume_id/iso9660.c:114:15: error: Buffer is accessed out of bounds: hs->id [bufferAccessOutOfBounds]
...

$ cat cppcheck.log | grep "Resource leak"
scripts/kconfig/confdata.c:376:4: error: Resource leak: out [resourceLeak]

The usual complaint about static analysis tools is the excessive number of warnings they generate, many of them false positives. But the tool can be configured and tuned to suppress those. This process can take a while, but the rewards are great in the end.
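
As a sketch of what such configuration can look like, cppcheck supports inline suppression comments, which are honored when it is run with the --inline-suppr option. The function, the table and the choice of finding ID below are placeholders for illustration only; this is not a finding cppcheck is known to report for this code.

```c
#include <stddef.h>

/* Hypothetical lookup table: four values plus a trailing sentinel. */
static const int table[5] = { 10, 20, 30, 40, -1 };

int lookup(size_t index)
{
    if (index >= 4)
        return -1;

    /* The comment below asks cppcheck to ignore one specific finding
     * on the next line, once it has been reviewed and judged a false
     * positive. The ID is a placeholder for whatever was reported. */
    // cppcheck-suppress arrayIndexOutOfBounds
    return table[index];
}
```

Keeping the suppression next to the code it covers documents the decision and limits it to a single line, rather than silencing the check for the whole project.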

Extending cppcheck

Cppcheck can also be extended by creating new check rules with regular expressions or even through modules written in Python.

And there are cppcheck plugins for several popular development tools like Eclipse, Visual Studio, Code::Blocks, Sublime Text and QtCreator.

[Screenshot: cppcheck Eclipse plugin]

If you are a vim fan, there is the Syntastic plugin:

[Screenshot: cppcheck Vim Syntastic plugin]

The cppcheck manual is available in PDF format on the project’s website.

Static code analysis is always the result of a trade-off between the quality of real bugs it finds and the number of false positives, and cppcheck is well balanced, extremely useful and simple to use.

You should really try to integrate cppcheck or another static analysis tool into your application’s development workflow. It won’t solve all of your problems, but it will certainly help to improve the quality of your application’s source code and decrease the time you spend fixing bugs.

About the author: Sergio Prado has been working with embedded systems for more than 25 years. If you want to know more about his work, please visit the About Me page or Embedded Labworks website.

Please email your comments or questions to hello at sergioprado.blog, or sign up for the newsletter to receive updates.

