Pointer Checker to detect buffer overflows and dangling pointers (part 2)

Overview

 
A dangling pointer arises when you use the address of an object after its lifetime has ended. This can happen when a function returns the address of an automatic variable, or when the address of a memory block is used after the block has been freed.
A dangling pointer is typically created when the programmer releases the object being pointed to, using the runtime function free() or the delete operator, but keeps using the pointer. If the pointer is set to NULL right after the object is released, later misuse is easy to spot, because on most platforms dereferencing a NULL pointer stops the program immediately with a segmentation fault. A freed pointer should therefore always be set to NULL. If the developer forgets to do so, the stale pointer remains in the code as a dangling pointer, which can be maliciously exploited and leads to significant quality and safety issues. Unlike Java, C and C++ do not provide automatic garbage collection, so the programmer is responsible for releasing memory, and mistakes lead to memory leaks as well as dangling pointers. In garbage-collected languages the programmer does not delete objects explicitly and so does not need to worry about dangling references, although garbage collection has a runtime cost. In C/C++, Fortran, Pascal and similar languages, dangling and wild pointer bugs, like buffer overflows, frequently become security holes. In less extreme cases, the results are unpredictable and inconsistent.
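The advice to set a freed pointer to NULL can be sketched as a small helper. This is only an illustration: free_and_clear and demo_free_and_clear are hypothetical names that are not part of the C runtime or of the Pointer Checker.

```cpp
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper (not a library function): frees *pp and clears it,
// so the caller is not left holding a dangling pointer.
static void free_and_clear(char **pp) {
    assert(pp != NULL);
    free(*pp);      // free(NULL) is a no-op, so a repeated call is harmless
    *pp = NULL;
}

// Allocates a buffer, releases it through the helper, and reports whether
// the caller's pointer was cleared (1) or not (0).
static int demo_free_and_clear(void) {
    char *buf = (char *)malloc(16);
    if (buf == NULL) return 0;
    strcpy(buf, "hello");
    free_and_clear(&buf);
    free_and_clear(&buf);    // safe: buf is already NULL
    return buf == NULL;
}
```

Note that the helper clears only the pointer it is given. Any other copy of the same address elsewhere in the program still dangles, which is where a tool such as the Pointer Checker helps.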
 
 
 

Dangling-pointer Pointer Checker

 
This article continues http://software.intel.com/en-us/articles/pointer-checker-to-debug-buffer-overruns-and-dangling-pointers and explores how the Pointer Checker can be used to detect dangling pointers in your code.
 

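To check for dangling pointers, add the following option:

-check-pointers-dangling=[heap/stack/all], which checks for dangling pointers to the heap, to the stack, or to both, respectively. The value none, used in one of the runs below, disables the check.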

We will look at a few simple examples that show how the Pointer Checker detects these bugs early in the program development lifecycle, saving the time and effort that manual checking would otherwise take.

//////library test6.cpp
#include <stdlib.h>
#include <string.h>
char* newFoo (char * x) {
    char* tmp = (char*)malloc(strlen (x) +1) ; //   line 2
    strcpy (tmp, x);  //  line 3
    free(x);      // problematic: newFoo cannot know whether x is still a valid heap pointer
    // If x points to a stack array, or to memory that is already freed, free(x) is invalid and may cause a segfault.
    return tmp;
}

 

In the code above, the pointer x passed to newFoo may be a dangling pointer, depending on whether it still points to allocated memory. In the client code below, mystr is freed first and then passed to newFoo.

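//////client6.cpp
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char* newFoo (char * x);   // defined in test6.cpp
int main()
{
char * mystr=(char*)malloc(12);
strcpy(mystr,"Hello World");
free(mystr);               // mystr is now a dangling pointer
char* str=newFoo(mystr);   // the freed pointer is passed on and read
printf("str =%s\n", str);
return 0;
}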

 

Output of runs built with different options for dangling-pointer checking:

 

%icc test6.cpp
%./a.out
Str=
////// with dangling-pointers checking disabled  - no traceback
% icc test6.cpp -g -rdynamic -check-pointers=rw -check-pointers-dangling=none
%./a.out
Str=
//////////  -check-pointers=write – no traceback info  ..   WHY???
% icc test6.cpp -g -rdynamic -check-pointers=write -check-pointers-dangling=all
%./a.out
Str=
////////with dangling-pointers enabled  --  rw = write and read
% icc test6.cpp -g -rdynamic -check-pointers=rw -check-pointers-dangling=all
%./a.out
…………………………………………………………..
CHKP: Bounds check error
    lb: 0x2
    ub: (nil)
  addr: 0x516016
   end: 0x516016
  size: 1
Traceback:
    at address 0x4030d4 in function __chkp_strcpy
    in file unknown line 0
    at address 0x4028f1 in function _Z6newFooPc
    in file /home/cmplr/usr4/mkulka3/dang5.cpp line 6
    at address 0x402c32 in function main
    in file /home/cmplr/usr4/mkulka3/dang5.cpp line 20
    at address 0x21f68154 in function __libc_start_main
    in file unknown line 0
    at address 0x402619 in function __gxx_personality_v0
    in file unknown line 0
CHKP Total number of bounds violations: 14

 

The output above is only an excerpt of the violations reported for the dangling pointer in the user code.

 

CHKP: Bounds check error
    lb: 0x2
    ub: (nil)
  addr: 0x516016
   end: 0x516016
  size: 1

 

As the excerpt above shows, CHKP sets the bounds of a dangling pointer to:

 

lower_bound(p) = 2;

upper_bound(p) = 0 (nil);

If your program reports a bounds violation with these bounds, it is a reference through a dangling pointer. The compiler sets lb and ub to these values to mark a pointer as dangling.
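When many violations are reported, it can help to pick out the ones that carry this marker. The following is a minimal sketch that assumes the report text has the "lb:" and "ub:" layout shown in the excerpt above; report_marks_dangling is a hypothetical helper and not part of the Pointer Checker runtime.

```cpp
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper: returns 1 if a CHKP report excerpt carries the
// dangling-pointer marker (lower bound 2, upper bound 0 or "(nil)").
static int report_marks_dangling(const char *report) {
    assert(report != NULL);
    const char *lb = strstr(report, "lb:");
    const char *ub = strstr(report, "ub:");
    if (lb == NULL || ub == NULL) return 0;

    // The lower bound is printed in hex, for example "0x2".
    unsigned long lower = 0;
    if (sscanf(lb + 3, " %lx", &lower) != 1) return 0;

    // A null upper bound is printed as "(nil)" in the excerpt;
    // a numeric zero is accepted as well.
    unsigned long upper = 1;
    char word[32] = "";
    if (sscanf(ub + 3, " %31s", word) != 1) return 0;
    if (strcmp(word, "(nil)") == 0) {
        upper = 0;
    } else if (sscanf(word, "%lx", &upper) != 1) {
        return 0;
    }
    return lower == 2 && upper == 0;
}
```

Fed the excerpt above, the helper returns 1. For a report whose bounds are ordinary addresses it returns 0, which would point to a plain out-of-bounds access rather than a dangling pointer.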

The case above was for the heap. When dangling pointer checking is enabled for the stack, the compiler finds all pointers that point to the locals of a function and, just before the function exits, changes their bounds in the same way as for the heap pointers above.

 

Both strlen (on the line marked "line 2") and strcpy (on the line marked "line 3") in the library test6.cpp read through the dangling pointer, which points to deallocated memory.

TIP 1: To use -check-pointers-dangling=[heap/stack/all], you also need the -check-pointers option; otherwise the Intel(R) compiler issues a warning, although it still creates the binary.

TIP 2: In the code above, you have to use -check-pointers=rw to get the dangling pointer information.

If we use -check-pointers=write with the dangling option, we do not get any traceback information.

This is because the library test6.cpp only reads through the freed pointer and never writes through it, so the access is caught only with -check-pointers=rw. As a general tip, use -check-pointers=rw so that read accesses are included.

TIP 3: In the case above the data is created on the heap, so -check-pointers-dangling=heap or -check-pointers-dangling=all detects the problem.
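Once the checker has pointed at newFoo, one possible fix is to stop freeing the argument inside the function and let the caller release the original exactly once, after the copy is made. This is a sketch of that idea, not the article author's fix; newFooCopy and demo_copy_then_free are hypothetical names.

```cpp
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Variant of newFoo that only copies. It never frees its argument, so it
// works for heap, stack and static strings alike. The caller frees the copy.
static char *newFooCopy(const char *x) {
    assert(x != NULL);
    char *tmp = (char *)malloc(strlen(x) + 1);
    if (tmp == NULL) return NULL;
    strcpy(tmp, x);
    return tmp;
}

// Caller side: copy first, then release the original exactly once.
// Returns 1 if the copy holds the expected text, 0 otherwise.
static int demo_copy_then_free(void) {
    char *mystr = (char *)malloc(12);
    if (mystr == NULL) return 0;
    strcpy(mystr, "Hello World");
    char *str = newFooCopy(mystr);   // mystr is still valid here
    free(mystr);
    mystr = NULL;                    // no dangling pointer left behind
    int ok = (str != NULL) && strcmp(str, "Hello World") == 0;
    free(str);
    return ok;
}
```

The design choice is about ownership: the function that allocates the copy hands it to the caller, and each block of memory is freed by whoever owns it.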

 

The next example illustrates a hidden dangling-pointer issue that surfaces only when we add a simple printf, which is the simplest debugging tool and is commonly used as a first step:

//////////////  test7.cpp  -- simple list elements - addition & removal based on arguments to exe
#include <stdlib.h>
#include <stdio.h>
typedef struct elt {
int data;
struct elt* next;
} elt;
int main(int argc, char* argv[]) {
int i;
elt* list = NULL;
elt* p = NULL;
//////  Addition based on command-line
for (i = 1; i < argc ; i++) {
p = (elt *) malloc(sizeof(elt));
p->data = (int) argv[i][0];    // args from command-line
p->next = list;
list = p;
}
//////////  Deletion of list
while(list != NULL) {
p = list;
list = list->next;    ///      line 21
free(p->next);    /* Oops... should be free(p). Freeing p->next releases the node that list now points to, so list dangles. */
////printf("%d\n",p->data);     ///  line 23
}
return 0;
}

The code above builds a list from the command-line arguments and then removes the list elements.

%icc test7.cpp
%./a.out sun mon tue wed thu fri sat

When we run this program, it may not crash, as in the run above, but the results can be unexpected. We may not notice the problem until a crash or an access violation occurs.

Suppose we un-comment line 23 to print the data of the node. The program now crashes, as the runs below show. Earlier, the freed (now unallocated) data was not being used in a visible way; the printf does something with the freed data and so exposes the dangling-pointer problem that was hidden. Even a basic printf() can reveal a hidden pointer issue in this way.

mkulka3@dpd20:~> ./a.out mil sun
0
Segmentation fault
mkulka3@dpd20:~> ./a.out sun mon tue wed thu fri sat
0
5251232
5251200
5251168
5251136
5251104
Segmentation fault

 

In a situation like the one above, building with the Pointer Checker options enabled is a more effective way to find the cause of the problem than waiting for a crash or unexpected results, which are harder to debug at a later stage.
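For reference, the deletion loop of the list example can be corrected as follows. This is a minimal sketch; free_list and build_and_free are hypothetical helper names that do not appear in the original listing.

```cpp
#include <assert.h>
#include <stdlib.h>

typedef struct elt {
    int data;
    struct elt *next;
} elt;

// Frees every node and returns how many were released.
static int free_list(elt *list) {
    int freed = 0;
    while (list != NULL) {
        elt *p = list;
        list = list->next;   // read the link before the node is released
        free(p);             // free the node itself, not p->next
        freed++;
    }
    return freed;
}

// Builds a list of n nodes the same way as the example (push to the front),
// then frees it and returns the number of nodes released.
static int build_and_free(int n) {
    assert(n >= 0);
    elt *list = NULL;
    for (int i = 0; i < n; i++) {
        elt *p = (elt *)malloc(sizeof(elt));
        if (p == NULL) break;    // stop building if allocation fails
        p->data = i;
        p->next = list;
        list = p;
    }
    return free_list(list);
}
```

The order of the two statements in the loop is what matters: the next link is read while the node is still allocated, and only then is the node freed.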

 

Detecting Dangling pointer to the stack

 

So far we have seen how dangling pointer checking detects problems with objects created on the heap. The Pointer Checker also checks automatic variables, which are created on the stack. For example, a pointer to data local to a function becomes a dangling pointer when the function exits, because a local non-static variable lives only as long as the function call.

Consider the following simple example:

 

//////////////////////test7.cpp
#include <stdio.h>
int *x;
  void p(){
      int y=10;
      x = &y;  // x is a global variable of type int*
   }
int main()
{
p();
printf("x is %d %lx\n",*x,(unsigned long)x);
return 1;
}

During the call to p, x is set to the address of y, which is on the stack. After p returns, y is deallocated and x therefore becomes a dangling pointer to the stack.
A dangling pointer to the stack is very dangerous: it can lead to catastrophic errors that are not easy to detect.

///////// plain compile
icpc -g test7.cpp   ////  gives expected result, still dangling pointer   ---  WHY?
./a.out
x is 10 7fff380498c0
///////// no traceback info  -- WHY??
icpc -check-pointers=rw -g -rdynamic test7.cpp -check-pointers-dangling=heap
./a.out
x is 10 7fff380498c0
 
icpc -check-pointers=rw -g -rdynamic test7.cpp -check-pointers-dangling=stack
./a.out
CHKP: Bounds check error
    lb: 0x2
    ub: (nil)
  addr: 0x7fff0514c9b0
   end: 0x7fff0514c9b3
  size: 4
Traceback:
    at address 0x400b55 in function main
    in file /home/cmplr/usr4/mkulka3/dang11.cpp line 11
    at address 0xa6118154 in function __libc_start_main
    in file unknown line 0
    at address 0x4009c9 in function __gxx_personality_v0
    in file unknown line 0
x is 10 7fff0514c9b0
CHKP Total number of bounds violations: 1

 

As discussed earlier, lb and ub are set to the values that the Pointer Checker uses to mark dangling pointers. size: 4 is the size of the access in bytes, which matches the int on the stack.

 

Note that the printed value is still the expected one, even though x is a dangling pointer. The pointer still refers to the deallocated stack location, and nothing has overwritten it yet. Since that location has been returned to the free pool, it can be reused at any time, in which case the program gives unexpected results.

 

The following example extends the stack example above and overwrites the stack to demonstrate an incorrect result.

 

///////////////  test8.cpp  -- enhancement of test7.cpp to illustrate inconsistent result.

 

#include <stdio.h>
int *x;
  void p(){
      int y=10;
      x = &y;  // x is a global variable of type int*
   }
void trash() { /* mess up stack */
   int i = 100;
   int j = 200;
}
int main()
{
p();
trash();
printf("x is %d %lx\n",*x,(unsigned long)x);
return 1;
}

 

We added the trash() function to overwrite the stack data that the now-dangling x points to. The stack location pointed to by x is reused by the call to trash().

 

Below are the inconsistent results obtained with different compilers:

 

icpc -g dang11.cpp
./a.out
x is 100 7fff2ec2fd10
./a.out
x is 100 7fff8cd48e20
g++ -g dang11.cpp
./a.out
x is 200 7fff153c04bc
./a.out
x is 200 7fff45a15b0c

 

In a nutshell, a dangling pointer can be exploited in different ways and can make your program behave differently from one build or run to the next, and such behavior is difficult to track down in a large code base.
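One way to remove the dangling pointer from the stack examples above is to let the caller own the storage, so that no address of a local variable escapes the function that declares it. The sketch below assumes this restructuring is acceptable for the program; p_fixed and read_after_trash are hypothetical names.

```cpp
#include <assert.h>
#include <stddef.h>

// The caller owns the storage; p_fixed only fills it in.
static void p_fixed(int *out) {
    assert(out != NULL);
    *out = 10;
}

// Same shape as trash() in the example. Its locals may reuse stack space,
// which no longer matters because nothing points into a dead frame.
static int trash(void) {
    volatile int i = 100;
    volatile int j = 200;
    return i + j;
}

// Returns the value read through x after trash() has run.
static int read_after_trash(void) {
    int y = 0;       // lives until read_after_trash returns
    int *x = &y;     // valid for the whole body of this function
    p_fixed(x);
    (void)trash();
    return *x;
}
```

Here y outlives every use of x, so the value read after the call to trash() is the one written by p_fixed, whichever compiler is used.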

The Pointer Checker helps detect and resolve this class of bugs, resulting in less error-prone code. Instead of resorting to brute force or trial and error, the programmer can fix the immediate cause of the bug rather than treating symptoms and side effects with a band-aid fix that can fail later.

 

Undimensioned array checks

 

Suppose we have some code built with the Pointer Checker options, but an undimensioned array that it declares (for example, extern int a[]) is defined in a library that is not Pointer Checker-enabled. The solution for such a scenario is to turn off the undimensioned checks with the negative form of the -check-pointers-undimensioned option (-no-check-pointers-undimensioned on Linux*, /Qcheck-pointers-undimensioned- on Windows*) and complete the build.

A similar approach can be used when, for example, an array "a" is defined in one module with a dimension (say 100), defined again in another module with a different dimension (say 200), and referenced as an undimensioned array "a[]" in a third module, which the C language does allow, although it is not a standard approach.

Compiling such an application with checks for undimensioned arrays enabled results in a "multiple definition of 'a'" error. In such a scenario, the solution is to turn off the undimensioned array checks for the relevant files when enabling the Pointer Checker for read or write operations.
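To make the term concrete, the sketch below shows an undimensioned declaration and a use of the array at a point where its size is not known from the declaration. It is kept in a single file so that it is self-contained, whereas in the scenarios above the definition would live in another module or library. read_a and the initial values are hypothetical.

```cpp
#include <assert.h>
#include <stddef.h>

// Undimensioned declaration: at this point the compiler knows the element
// type of a, but not how many elements it has.
extern int a[];

// Reads a[i]. The caller passes the real length, because it cannot be
// derived from the declaration above.
int read_a(size_t i, size_t len) {
    assert(i < len);
    return a[i];
}

// The definition, with its dimension. In the scenarios described above,
// this would be in a different module or library.
int a[100] = { 7, 8, 9 };
```

Because read_a sees only the undimensioned declaration, any bound for its accesses has to come from somewhere else, which is why a mismatch between modules matters for these checks.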

 

Guidelines to use Pointer-Checker

  • When using the Pointer Checker, use a debug configuration (the -g option on Linux, /Zi on Windows) for testing and debugging so that symbols are available for better traceback. On Linux, also use the -rdynamic linker option so that function names appear in the traceback. Compile without optimization to avoid optimizing away memory accesses and to improve source code correlation.
  • Catch out-of-bounds (OOB) errors for write operations first and then for reads, since bad writes are more critical and can cause severe software vulnerabilities.
  • Release your application with the Pointer Checker option disabled, because it increases both application size and execution time. The runtime cost is high, about 2X-5X the execution time (based on runs of some open source applications; work in progress and not yet published), and code size increases by 20 percent to 100 percent or more, depending on the application.

A few takeaways

  • A Pointer Checker-enabled application catches out-of-bounds memory accesses before memory corruption occurs, but only in code paths that are actually executed and were compiled with the Pointer Checker option enabled. You should therefore be familiar with the application code and use representative data sets that trigger pointer issues, for both buffer overflows and dangling pointers.
  • To use this new Pointer Checker feature, you will need a valid license to the Intel® Parallel Studio XE 2013 or the Intel® C++ Studio XE 2013 products. 

Resources in IDZ

 

A useful resource covering several use cases, examples, and the trace reporting APIs is the following article from Parallel Magazine in the Intel Software Forum. The whitepaper is authored by Ganesh Kittur of the Intel Software Group.

 

http://software.intel.com/sites/products/parallelmag/singlearticles/issue11/7080_2_IN_ParallelMag_Issue11_Pointer_Checker.pdf

 

This article covers more usage models, for example how to use Pointer Checker-enabled code with non-enabled libraries or modules by using a set of APIs defined in the Pointer Checker library, so that enabled code can interoperate with non-enabled modules without pointer mismatch issues.

 

A basic technical presentation with a brief overview of the capabilities, runtime APIs, and intrinsics, along with a few simple examples of pointer checking, can be found at:

 

http://software.intel.com/sites/default/files/m/d/4/1/d/8/Pointer_Checker-Webinar.pdf

 
