Pointer Checker to Debug Buffer Overruns and Dangling Pointers (Part 1)


Next Article

Pointer Checker to detect buffer overflows and dangling pointers (part 2)

Overview

A buffer overflow, or overrun, is a program anomaly that occurs when memory is read or written outside the region allocated for it. The stray access may land inside memory that the program already owns, or outside it when the access goes beyond the stack or heap space allocated to the program. In protected mode, the OS and other applications are shielded from overruns in a given application, but the application's own data is not. Current C/C++ front ends have no built-in mechanism for bounds checking of arrays or pointers; they rely on the OS to ensure that only valid memory is accessed.

Often the code neither crashes nor segfaults during development, and then the unexpected happens at a critical point close to release. If the program works correctly through all the testing stages, the developer may assume all is well. Or, if the developer accidentally discovers an out-of-bounds (OOB) error in the code, he or she is left wondering why the code does not crash. And if it does crash, the developer may decide not to fix the issue because of the complexity and cost involved, leaving the system vulnerable to security breaches or malicious attacks. An out-of-bounds access can play out in several ways:

  • The program may work correctly most of the time.
  • The program may show unexpected behavior, when the accesses fall within memory that the program has allocated.
  • The program may segfault, when the accesses fall outside the memory allocated to the program.

To illustrate, consider a simple program:

//////test1.cpp
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
    char a[8];
    char *b = (char *)malloc(4);
    strcpy(a, "aaaaaaaaaa");   // writes 11 bytes into an 8-byte array
    strcpy(b, "bbb");
    printf("a=%s  b=%s\n", a, b);
    return 0;
}
  
 
% icc test1.cpp
% ./a.out
a=aaaaaaaaaa  b=bbb

So far, the user thinks the code works perfectly, judging by the output. We do get the expected result, even though more bytes are written to a than were allocated for it.

Now, if we replace the malloc() allocation with a simple array b[4], the result is different.

////////////test2.cpp
char a[8];
char b[4];
strcpy(a, "aaaaaaaaaa");     // line 10
strcpy(b, "bbb");
printf("a=%s  b=%s\n", a, b);
return 0;
 
 
With b allocated in a different memory area, that is, on the stack instead of the heap, the result is:

% icc test2.cpp
% ./a.out
a=aaaaaaaabbb  b=bbb

In both versions the developer's intent is the same; only the implementation differs.

The underlying problem in both is that C/C++ compilers do not perform bounds checking for any kind of memory, probably because of the run-time overhead and housekeeping involved.

In the first case the output looks correct, because b lives on the heap and writing to it cannot overwrite the bytes that spilled past the end of a on the stack. Although the output is correct, the overrun is a hidden flaw that can go undetected, risking data corruption, and it may surface after even a slight change to the code.

In the second case both arrays are on the stack. Depending on how the stack is laid out, an access outside the bounds of an array simply lands in another part of the already allocated stack space, often in another stack variable. That is what happens here: the copy into a spills into b, and the later copy into b overwrites the terminating null of a, so printing a runs on into the contents of b. The root cause is that a is a few bytes too short; declaring it with a size of 11 would fix it.
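Independently of where the buffers live, the overrun itself can be avoided by comparing the length of the source string with the size of the destination before copying. The following is a minimal sketch of that idea; the helper name bounded_copy is hypothetical and is not part of the article's test programs or of any Intel library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (illustration only): copies src into dst only if the
// string and its terminating '\0' fit into dst_size bytes.
// Returns true on success; on failure dst is set to an empty string.
bool bounded_copy(char *dst, std::size_t dst_size, const char *src)
{
    assert(dst != nullptr && src != nullptr && dst_size > 0);
    std::size_t needed = std::strlen(src) + 1;   // characters plus '\0'
    if (needed > dst_size) {
        dst[0] = '\0';                           // refuse instead of overrunning
        return false;
    }
    std::memcpy(dst, src, needed);
    return true;
}
```

With char a[8], a call such as bounded_copy(a, sizeof a, "aaaaaaaaaa") returns false because 11 bytes are needed, while a 7-character string fits.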

Pointer Checker

The Intel Composer XE 2013 product release contains Pointer Checker, a new testing and debugging feature that helps find such buffer overruns in Windows* OS and Linux* OS applications. Pointer Checker performs bounds checking on all memory accesses made through pointers and identifies any out-of-bounds access in Pointer Checker-enabled code.

In this article, we explore a few simple scenarios that can nevertheless occur in a complex code base.

We look at the compiler options and reports using small examples of the different pointer problems that can occur. The aim is to show how pointer anomalies can be resolved faster, early in a project, saving much of the time that the traditional approach of manual analysis would take. Pointer Checker addresses these problems through early detection and reporting of bounds violations.

Pointer Checker has three main features for detecting pointer-related bugs. Each is enabled by a compiler option, which also adds the associated libraries, and flags both obvious and hidden flaws at run time.

  1. Checking of all indirect accesses through pointers and accesses to arrays.

    -check-pointers=[none | write | rw] (Linux* OS)

  2. Checking for dangling pointers.

    -check-pointers-dangling=[none | heap | stack | all] (Linux* OS)

  3. Checking of bounds for arrays without dimensions.

    -[no-]check-pointers-undimensioned (Linux* OS)

Below is a sample traceback report, as printed on the command line, for test2.cpp or test1.cpp.

We added -rdynamic so that symbols are also placed in the dynamic symbol table, which gives better traceback information: the report can then name the function that caused the violation (here, strcpy).

% icc -g -rdynamic -check-pointers=rw test2.cpp
% ./a.out
CHKP: Bounds check error
    lb: 0x7fff13b868d0
    ub: 0x7fff13b868d7
  addr: 0x7fff13b868d8
   end: 0x7fff13b868d8
  size: 1
Traceback:
    at address 0x402c5d in function __chkp_strcpy
    in file unknown line 0
    at address 0x4026c6 in function main
    in file /home/cmplr/usr4/mkulka3/test2.cpp line 10
    at address 0x976df154 in function __libc_start_main
    in file unknown line 0
    at address 0x402569 in function __gxx_personality_v0
    in file unknown line 0
CHKP: Bounds check error
    lb: 0x7fff13b868d0
    ub: 0x7fff13b868d7
  addr: 0x7fff13b868d9
   end: 0x7fff13b868d9
  size: 1
Traceback:
    at address 0x402c5d in function __chkp_strcpy
    in file unknown line 0
    at address 0x4026c6 in function main
    in file /home/cmplr/usr4/mkulka3/test2.cpp line 10
    at address 0x976df154 in function __libc_start_main
    in file unknown line 0
    at address 0x402569 in function __gxx_personality_v0
    in file unknown line 0
CHKP: Bounds check error
    lb: 0x7fff13b868d0
    ub: 0x7fff13b868d7
  addr: 0x7fff13b868da
   end: 0x7fff13b868da
  size: 1
Traceback:
    at address 0x402c5d in function __chkp_strcpy
    in file unknown line 0
    at address 0x4026c6 in function main
    in file /home/cmplr/usr4/mkulka3/test2.cpp line 10
    at address 0x976df154 in function __libc_start_main
    in file unknown line 0
    at address 0x402569 in function __gxx_personality_v0
    in file unknown line 0
a=aaaaaaaabbb  b=bbb
CHKP Total number of bounds violations: 3
 

So strcpy wrote 3 bytes past the end of array a, and the report shows 3 bounds violations with details for each.

The address 0x4026c6 in main can also be mapped to a source line, as below:

% addr2line -e ./a.out 0x4026c6
/home/cmplr/usr4/mkulka3/test2.cpp:10

In brief, the fields of the report mean:

CHKP: prefix that marks Pointer Checker messages

lb: lower bound, the lowest valid address of the violated array

ub: upper bound, the highest valid address of the array

addr: address of the first byte of the out-of-bounds access

end: address of the last byte of the out-of-bounds access

size: size in bytes of the data type accessed out of bounds. For an array of char the size is 1; for an array of int it is 4 on most systems.

In general, size = end - addr + 1.

Each violation is followed by its traceback information.
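To make the relationship between the report fields concrete, here is a small conceptual model. It is only a sketch: the struct and function names are hypothetical, and it is not meant to describe how Pointer Checker is implemented internally. It only encodes the rule stated above, size = end - addr + 1, and treats an access as a violation when any touched byte lies outside the range lb to ub.

```cpp
#include <cassert>
#include <cstdint>

// Conceptual model of one report entry. The field names mirror the report;
// the struct and the functions are hypothetical, for illustration only.
struct Access {
    std::uint64_t lb;    // lowest valid address of the object
    std::uint64_t ub;    // highest valid address of the object
    std::uint64_t addr;  // first byte touched by the access
    std::uint64_t size;  // number of bytes touched
};

// Last byte touched, rearranged from size = end - addr + 1.
std::uint64_t access_end(const Access &x)
{
    assert(x.size > 0);
    return x.addr + x.size - 1;
}

// The access is a violation if any touched byte lies outside [lb, ub].
bool is_violation(const Access &x)
{
    return x.addr < x.lb || access_end(x) > x.ub;
}
```

Plugging in the first entry of the report (lb 0x7fff13b868d0, ub 0x7fff13b868d7, addr 0x7fff13b868d8, size 1) gives an end equal to addr, one byte above ub, so the model classifies it as a violation.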

As we can see, the problem is diagnosed right away without manual effort, with the compiler doing the necessary instrumentation, which can be a major boost to productivity.

Caveat: Pointer Checker detects the OOB and dangling-pointer bugs described here, but it does not detect memory leaks. Those can be flagged by the Static Analysis (SA) feature of Intel® Inspector XE and analyzed with the Intel® Parallel Inspector product, both included in Intel® Parallel Studio XE 2011 or newer.

Some more examples of buffer overruns caused by programmer mistakes:

/////////test3.cpp - wrong
int *j = (int *)malloc(4);                 // allocates 4 bytes, not 4 ints
j[0] = 1; j[1] = 2; j[2] = 3; j[3] = 4;    // array writes
int i = j[0] + j[1] + j[2] + j[3];         // array reads
printf("result = %d\n", i);

% ./a.out
result = 10

Again the small snippet appears to work and prints the right result. In a large code base, the same mistake might cause unexpected behavior or intermittent crashes.

As the code shows, the allocation is only 4 bytes, not 4 integers, so the accesses to j[1], j[2], and j[3] are out of bounds.

With Pointer Checker enabled through -check-pointers=rw, running the program as shown previously reports 6 bounds violations: 3 for reads and 3 for writes. With -check-pointers=write, it reports only the 3 out-of-bounds writes.
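The count of three out-of-bounds elements follows from simple arithmetic on the allocation size. The sketch below works it out; the function name is hypothetical, and the figure of 3 assumes a 4-byte int, as on most current systems.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: counts how many of the first `count` elements, each
// `elem_size` bytes wide, do not fit completely into `bytes` allocated bytes.
std::size_t count_oob_elements(std::size_t bytes, std::size_t elem_size,
                               std::size_t count)
{
    assert(elem_size > 0);
    std::size_t oob = 0;
    for (std::size_t i = 0; i < count; ++i) {
        std::size_t end_of_element = (i + 1) * elem_size;  // one past its last byte
        if (end_of_element > bytes) {
            ++oob;
        }
    }
    return oob;
}
```

For the allocation above, count_oob_elements(4, 4, 4) yields 3. Each of those elements is written once and read once, which matches the 6 violations reported with rw checking and the 3 reported with write-only checking.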

///////  test3.cpp - right
int *j = (int *)malloc(4 * sizeof(int));

As we can see, the Pointer Checker-enabled build detected the problem early, the first time the code was run. That improves code quality and reliability, speeds up error checking and troubleshooting, and reduces maintenance at later, more critical stages.

Another example is using sizeof() instead of strlen() on a string that is held through a pointer.

//////////test4.cpp - Wrong version:--
another_chptr = (char *) malloc (strlen((char *)my_chptr));
memset (another_chptr, '@', sizeof(my_chptr));   // sizeof gives the size of the pointer, not the string length

//////test4.cpp - Correct version:--
another_chptr = (char *) malloc (strlen((char *)my_chptr));
memset (another_chptr, '@', strlen(my_chptr));
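A self-contained variant of the corrected version might look like the sketch below. The function name is hypothetical. Unlike the snippet above, it allocates one extra byte and writes a terminating '\0', on the assumption that the filled buffer will later be used as a string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns a heap buffer holding strlen(src) copies of
// `fill` followed by '\0'. The caller must free() the result.
// Returns nullptr if the allocation fails.
char *make_filled_copy(const char *src, char fill)
{
    assert(src != nullptr);
    std::size_t len = std::strlen(src);               // characters in the string
    char *out = static_cast<char *>(std::malloc(len + 1));
    if (out == nullptr) {
        return nullptr;
    }
    std::memset(out, fill, len);                      // length, not sizeof(src)
    out[len] = '\0';
    return out;
}
```

Inside this function sizeof(src) would be the size of a pointer, typically 4 or 8 bytes, regardless of how long the string is, which is why the wrong version overruns the buffer for short strings.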

Consider another example, tested on Windows* OS, in which a simple mistake leads to inconsistent results between runs.

//////////test5.cpp
#include <stdio.h>
#include <string.h>

int main()
{
    char a[8];
    char b[8];
    memset(a, 'a', 7);          // a[7] is left uninitialized
    if (strlen(a) < 8)
        strcpy(b, "a<8byte");
    else
        strcpy(b, "a>8byte");
    printf("a=%s  b=%s\n", a, b);

    return 0;
}
 
  
C:Usersmkulka3DocumentsIntel-DocsKBstuffPoint>oob2.exe
a=aaaaaaa  b=a<8byte

C:Usersmkulka3DocumentsIntel-DocsKBstuffPoint>oob2.exe
a=aaaaaaa ├<#wε<#wÇ  b=a>8byte

C:Usersmkulka3DocumentsIntel-DocsKBstuffPoint>oob2.exe
a=aaaaaaa  b=a<8byte

C:Usersmkulka3DocumentsIntel-DocsKBstuffPoint>oob2.exe
a=aaaaaaa  b=a<8byte

C:Usersmkulka3DocumentsIntel-DocsKBstuffPoint>oob2.exe
a=aaaaaaa  b=a<8byte

C:Usersmkulka3DocumentsIntel-DocsKBstuffPoint>oob2.exe
a=aaaaaaa $  b=a>8byte

As you can see, there are two different results, a<8byte and a>8byte, which is unpredictable behavior.

Here, the user combined memset with strlen. By definition, memset only sets the given memory locations to 'a'; it is not a string function and does not terminate the data with a null character. Whether a null character follows the seven 'a' characters depends on what the memory happens to contain, so the result is inconsistent.

On a different OS the result may happen to be consistent, since it depends on the platform and its library implementation.

////////test5.cpp  -- correct
char a[8];
char b[8];
memset(a, 'a', 7);
strcpy(a, "aaaaaaa");   // copies 'a' 7 times; the 8th char is '\0'
if (strlen(a) < 8)
    strcpy(b, "a<8byte");
else
    strcpy(b, "a>8byte");
printf("a=%s  b=%s\n", a, b);

  1. Because the if-else test uses strlen(), a string function, filling the array with strcpy(), also a string function, makes the result consistent.
  2. Alternatively, instead of strlen(), use a different test for the condition above.
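One possible form of the second option is a length test that never looks beyond the array, whether or not the data is null-terminated. The sketch below uses the standard memchr function for this; the helper name bounded_length is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns the string length if a '\0' occurs within the
// first `size` bytes of buf, otherwise returns `size` ("not terminated").
// It never reads beyond buf[size - 1].
std::size_t bounded_length(const char *buf, std::size_t size)
{
    assert(buf != nullptr);
    const void *nul = std::memchr(buf, '\0', size);
    if (nul == nullptr) {
        return size;
    }
    return static_cast<std::size_t>(static_cast<const char *>(nul) - buf);
}
```

In test5.cpp the condition could then be written as bounded_length(a, sizeof a) < sizeof a, which is true only when the array really contains a terminated string.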

Note: The above problem will NOT be detected by Pointer Checker, because there was no actual OOB violation or array overrun, only an incorrect combination of standard APIs. Such issues may be reported as warnings by the SSA feature of Inspector. An OOB error is reported only if strcpy writes past the end of the array, which is a different kind of problem from this incorrect use of APIs. Not detecting such misuse is one of the minor limitations of Pointer Checker.

Many OOB problems in a complex application are difficult to detect because they occur only at run time and depend on the tests, data sets, and execution paths that provoke them. Pointer Checker is essentially a dynamic run-time analysis tool: it reports only the violations that actually execute, so it cannot flag OOB accesses in code regions that are never run. Such problems may surface later, when customers exercise the application with different data.
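The dependence on test data can be made concrete with a small sketch. All names here are hypothetical. An unguarded write to slots[index] would be out of bounds only for index values of 8 or more, so a test suite that never passes such a value never executes the faulty access, and a run-time checker has nothing to report. The guarded version below rejects those values instead.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example: number of elements in the array being written.
const std::size_t kSlots = 8;

// Returns true if `index` stays inside an array of kSlots elements.
bool index_in_bounds(std::size_t index)
{
    return index < kSlots;
}

// Guarded store: refuses the write instead of overrunning `slots`.
bool store(int (&slots)[kSlots], std::size_t index, int value)
{
    if (!index_in_bounds(index)) {
        return false;
    }
    slots[index] = value;
    return true;
}
```

A test that calls store with index 3 exercises only the in-bounds path; the out-of-bounds case is reached only when some input produces an index of 8 or more.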

Pointer Arithmetic and Pointer Checking

Pointer arithmetic by itself does not trigger the Pointer Checker. A pointer may go out of range, as long as it is not used to make an indirect reference to an out-of-range address.

For example, for an array of 100 elements, the following applies:

char *p = (char *)malloc(100);

p += 200;      // pointer is out of range, but no error

p[-101] = 0;   // access is still in range,
               // i.e. it is the original p[99]

p[0] = 0;      // out-of-bounds error occurs here
               // because it is the original p[200]
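The behavior can be pictured as bounds that stay attached to the original allocation while the pointer value moves freely. The sketch below is a conceptual model with hypothetical names, not a description of Pointer Checker internals. It uses an integer offset instead of a real pointer, so the model itself never forms an out-of-range pointer.

```cpp
#include <cassert>
#include <cstddef>

// Conceptual model only (hypothetical type): the allocation size is fixed,
// the offset moves with pointer arithmetic and is never checked on its own.
struct TrackedPtr {
    std::ptrdiff_t offset;  // current position relative to the allocation start
    std::ptrdiff_t size;    // allocation size in bytes

    // Models p += n: no bounds check happens here.
    void advance(std::ptrdiff_t n) { offset += n; }

    // Models the check made when p[index] is actually dereferenced.
    bool can_access(std::ptrdiff_t index) const
    {
        assert(size > 0);
        std::ptrdiff_t target = offset + index;
        return target >= 0 && target < size;
    }
};
```

Starting from offset 0 and size 100, advancing by 200 raises no error; can_access(-101) is true because it refers to byte 99, and can_access(0) is false because it refers to byte 200.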

We discussed using the -check-pointers option to analyze buffer overruns and pointer accesses beyond allocated boundaries. Another useful debugging feature is the detection of dangling pointers.

In the next article on the Pointer Checker feature, we will discuss finding bugs related to dangling pointers in stack and heap memory with a few examples, and finish with guidelines and tips for using the feature efficiently.
