Systematic Testing of Multithreaded Applications

As multi-core CPUs are becoming mainstream, developing reliable multithreaded applications is increasingly becoming important. One vital issue in this trend is the testing of multithreaded applications. Due to inherently nondeterministic nature of concurrent programs (caused by nondeterministic thread schedules), program errors resulting from unintended thread interleavings  can be extremely difficult to track down by testing.

While testing a sequential program requires to “cover” the input space of the program, testing a multithreaded program requires to simultaneously “cover” both the input space and the space of possible thread interleavings. In theory, to prove that a multithreaded program behaves as expected for a given concrete input, one must validate that the program behaves as expected under all possible thread interleavings.

Many approaches have been proposed in the literature to systematically explore thread interleavings for a given program input. To reduce the space of possible interleavings, these approaches assume that the program under test strictly follows the mutual-exclusion locking principle. This principle states that each shared variable is associated with at least one mutual exclusion lock, and that the lock or locks are always held whenever any thread accesses that variable. Existing approaches then treats each critical section (i.e., code segments guarded by the locks) as an atomic block and exhaustively explore all possible orderings of atomic blocks, aiming to identify program errors resulting from unintended orderings. Note that without the assumption of strict adherence to the mutual-exclusion locking principle, the number of possible thread interleavings grows exponentially in the size of the program, since a thread interleaving can potentially occur after executing each machine instruction, whereas with the mutual-exclusion assumption, the space of interleavings grows exponentially in the number of atomic blocks.

Even with the drastic decrease in the number of thread interleavings to be explored, existing approaches still suffer from severe scalability issues mainly caused by the exponentially growing space for exploration. We conjecture that the scalability, efficiency, and effectiveness of the existing approaches can significantly be improved, if the exploration is performed in a goal-oriented fashion.