Myths about static analysis. The third myth - dynamic analysis is better than static analysis.

While communicating with people on forums, I noticed there are a few lasting misconceptions concerning the static analysis methodology. I decided to write a series of brief articles where I want to show you the real state of things.

The third myth is: "Dynamic analysis performed by tools like valgrind for C/C++ is much better than static code analysis".

The statement is rather strange. Dynamic and static analyses are just two different methodologies which supplement each other. Programmers seem to understand it, but I hear it again and again that dynamic analysis is better than static analysis.

Let me list advantages of static code analysis.

Diagnostics of all the branches in a program

Dynamic analysis in practice cannot cover all the branches of a program. After these words, fans of valgrind tell me that one should create appropriate tests. They are right in theory. But anyone who tried to create them understands how complicated and long it is. In practice, even good tests cover not more than 80% of program code.

It is especially noticeable in code fragments handling non-standard/emergency situations. If you take an old project and check it with a static analyzer, most errors will be detected in these very places. The reason is that even if the project is old, these fragments stay almost untested. Here is a brief example to show you what I mean (FCE Ultra project):

fp = fopen(name,"wb");
int x = 0;
if (!fp)
int x = 1;

The 'x' flag will not be equal to one if the file wasn't opened. It is because of such errors that something goes wrong in programs: they crash or generate meaningless messages instead of adequate error messages.

Scalability

To be able to check large projects through dynamic methods regularly, you have to create a special infrastructure. You need special tests. You need to launch several instances of an application in parallel with different input data.

Static analysis is scaled several times easier. Usually you need only a multi-core computer to run a tool performing static analysis.

Analysis at a higher level

One of the advantages of dynamic analysis is that it knows what function and with what arguments is being called. Consequently, it can check if the call is correct. Static analysis can't know it and can't check arguments' values in most cases. This is a disadvantage of this method. But static analysis performs analysis at a higher level than dynamic analysis. This feature allows a static analyzer to detect issues which are correct from the viewpoint of dynamic analysis. Here is a simple example (ReactOS project):

void Mapdesc::identify( REAL dest[MAXCOORDS][MAXCOORDS] )
{
memset( dest, 0, sizeof( dest ) );
for( int i=0; i != hcoords; i++ )
dest[i][i] = 1.0;
}

Everything is good here from the viewpoint of dynamic analysis, while static analysis gives the alarm because it is very suspicious that the number of bytes being cleared in an array coincides with the number of bytes the pointer consists of.

Here you are another example from the Clang project:

MapTy PerPtrTopDown;
MapTy PerPtrBottomUp;
void clearBottomUpPointers() {
PerPtrTopDown.clear();
}
void clearTopDownPointers() {
PerPtrTopDown.clear();
}

Is there anything here dynamic analysis may find suspicious? Nothing. But a static analyzer can suspect there is something wrong. The error is this: inside clearBottomUpPointers() there must be this code: "PerPtrBottomUp.clear();".

Para obter mais informações sobre otimizações de compiladores, consulte Aviso sobre otimizações.