Pointer Checker to debug buffer overruns and dangling pointers (Part 1)


Overview

A buffer overflow, or overrun, is a program anomaly in which memory is read or written outside the region allocated for it. The stray access may land inside memory that the program has already allocated, or outside the program's stack and heap altogether. In protected mode the OS and other applications are shielded from one application's overruns, but the present C/C++ front ends have no built-in mechanism for bounds checking on arrays or pointers. They rely on the OS to ensure that only valid memory is accessed.

Often the faulty code neither crashes nor segfaults during development, and the unexpected happens only at the critical time to market. If a program works correctly through every thorough testing stage, the developer may assume all is well. A developer who accidentally discovers an out-of-bounds (OOB) error may be left wondering why the code does not crash. And when it does crash, the fix may be postponed because of its complexity and cost, leaving the system vulnerable to a security breach or malicious attack. An overrun can show up in several ways:

  • The program may work correctly most of the time.
  • The program may behave unexpectedly, when the accesses fall within memory that the program has allocated.
  • The program may segfault, when the accesses fall outside the memory allocated to the program.

To illustrate, consider a simple program:

 

//////test1.cpp
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
    char a[8];
    char *b = (char *)malloc(4);
    strcpy(a, "aaaaaaaaaa");   // overrun: 11 bytes copied into an 8-byte array
    strcpy(b, "bbb");
    printf("a=%s  b=%s\n", a, b);
    free(b);
    return 0;
}

 

% icc test1.cpp

%./a.out

a=aaaaaaaaaa  b=bbb

 

 

So far the user believes the code works perfectly, judging by the output: we get the expected result even though more bytes are written to "a" than were allocated for it.

Now replace the malloc() allocation with a plain array b[4], so that "b" lives on the stack instead of the heap:

 

  


////////////test2.cpp
char a[8];
char b[4];
strcpy(a, "aaaaaaaaaa");   // line 10: overruns a
strcpy(b, "bbb");
printf("a=%s  b=%s\n", a, b);
return 0;

 

With "b" allocated on the stack instead of the heap, as in the code above, the result changes:

 

%icc test2.cpp

%./a.out

 

a=aaaaaaaabbb  b=bbb

 

 

In both programs the developer's intent is the same; only the implementation differs.

The underlying problem is also the same: C/C++ compilers do no bounds checking for any kind of memory, probably because of the runtime overhead and housekeeping involved.

In the first case the output looks correct, because "b" lives on the heap and writing to it cannot disturb the overflowed bytes of "a" on the stack. The flaw is still there, though. It can go undetected, risks data corruption, and may surface after some slight change to the code.

In the second case both arrays are on the stack. Depending on the particular implementation, an access outside the bounds of an array simply lands in another part of the already allocated stack space, or in another stack variable. That is what happens here: the copy into "a" spills past its 8 bytes into the storage of "b", and the later copy into "b" overwrites the spilled characters, so printing "a" shows "aaaaaaaa" followed by "bbb". The array "a" was a few bytes too short; declaring it with size 11 would have avoided the overrun.
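One way to avoid both overruns by hand is to pass the destination size along with every copy. The sketch below is illustrative only: `copy_bounded` is a hypothetical helper, not part of the article's test programs or of Pointer Checker. It relies on the standard `snprintf`, which truncates instead of writing past the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy src into dst without writing more than
 * dst_size bytes. Returns 0 if the whole string fit, -1 if it had to be
 * truncated or if dst_size is 0. */
static int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    if (dst_size == 0)
        return -1;
    /* snprintf writes at most dst_size bytes, including the final '\0',
     * and returns the length the full string would have needed. */
    int needed = snprintf(dst, dst_size, "%s", src);
    return (needed >= 0 && (size_t)needed < dst_size) ? 0 : -1;
}
```

With `char a[8]`, the call `copy_bounded(a, sizeof a, "aaaaaaaaaa")` returns -1 and leaves a terminated seven-character string in "a", so the caller learns about the size mismatch instead of silently corrupting a neighbouring variable.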

  

Pointer Checker

The Intel Composer XE 2013 product release contains Pointer Checker, a new testing and debugging feature that helps find such buffer overruns in applications for Windows* OS or Linux* OS. Pointer Checker performs bounds checking on all memory accesses through pointers and identifies any out-of-bounds access in Pointer Checker-enabled code.

In this article we explore a few simple scenarios that can nevertheless occur in a complex code environment.

We look at the compiler options and reports, using small examples of the different pointer problems that can occur. The aim is to show how pointer anomalies can be resolved early in a project, saving much of the time that traditional manual analysis would take, through early detection and reporting of bounds violations.

Pointer Checker has three main features for detecting pointer-related bugs, obvious or hidden, at run time. Each is controlled by a compiler option, which also adds the associated libraries:

 

1. Enables checking of all indirect accesses through pointers and accesses to arrays.
   -check-pointers=[none | write | rw] (Linux* OS)

2. Enables checking for dangling pointers.
   -check-pointers-dangling=[none | heap | stack | all] (Linux* OS)

3. Enables checking of bounds for arrays without dimensions.
   -[no-]check-pointers-undimensioned (Linux* OS)

Below is a sample traceback report, printed on the command line, for test2.cpp or test1.cpp.

We compiled with -rdynamic so that the standard library symbols are also placed in the dynamic symbol table. This gives better traceback information, because the report can name the function that caused the violation (here, strcpy).

  


% icc -g -rdynamic -check-pointers=rw test2.cpp

%./a.out

 

CHKP: Bounds check error

    lb: 0x7fff13b868d0

    ub: 0x7fff13b868d7

  addr: 0x7fff13b868d8

   end: 0x7fff13b868d8

  size: 1

Traceback:

    at address 0x402c5d in function __chkp_strcpy

    in file unknown line 0

    at address 0x4026c6 in function main

    in file /home/cmplr/usr4/mkulka3/test2.cpp line 10

    at address 0x976df154 in function __libc_start_main

    in file unknown line 0

    at address 0x402569 in function __gxx_personality_v0

    in file unknown line 0

CHKP: Bounds check error

    lb: 0x7fff13b868d0

    ub: 0x7fff13b868d7

  addr: 0x7fff13b868d9

   end: 0x7fff13b868d9

  size: 1

Traceback:

    at address 0x402c5d in function __chkp_strcpy

    in file unknown line 0

    at address 0x4026c6 in function main

    in file /home/cmplr/usr4/mkulka3/test2.cpp line 10

    at address 0x976df154 in function __libc_start_main

    in file unknown line 0

    at address 0x402569 in function __gxx_personality_v0

    in file unknown line 0

CHKP: Bounds check error

    lb: 0x7fff13b868d0

    ub: 0x7fff13b868d7

  addr: 0x7fff13b868da

   end: 0x7fff13b868da

  size: 1

Traceback:

    at address 0x402c5d in function __chkp_strcpy

    in file unknown line 0

    at address 0x4026c6 in function main

    in file /home/cmplr/usr4/mkulka3/test2.cpp line 10

    at address 0x976df154 in function __libc_start_main

    in file unknown line 0

    at address 0x402569 in function __gxx_personality_v0

    in file unknown line 0

 

a=aaaaaaaabbb  b=bbb

CHKP Total number of bounds violations: 3

 

So strcpy wrote 3 bytes beyond the upper bound of array "a" (the addresses ending in d8, d9, and da, just past ub), and the report shows 3 bounds violations with details for each.

The address 0x4026c6 in main can also be mapped to a source line, as below:
 


% addr2line -e ./a.out 0x4026c6

/home/cmplr/usr4/mkulka3/test2.cpp:10

 

In brief, the fields of the report mean:

CHKP: prefix of each Pointer Checker message
lb: lower bound, the first valid address of the violated array
ub: upper bound, the last valid address of the array
addr: first address of the out-of-bounds access
end: last address of the out-of-bounds access
size: size in bytes of the data type accessed out of bounds. If the array is of type char the size is 1; if the array holds integers it is 4 on most systems.

Generally, size = end - addr + 1.

Each OOB violation is followed by its own traceback information.
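The relationship between the report fields can be checked with a few lines of C. This is only a sketch of the arithmetic; `access_size` and `violates_bounds` are hypothetical names chosen for illustration, and the code makes no claim about how the Pointer Checker runtime is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bytes touched by an access spanning [addr, end], inclusive. */
static size_t access_size(uint64_t addr, uint64_t end)
{
    assert(addr <= end);
    return (size_t)(end - addr + 1);
}

/* An access is out of bounds if any of its bytes lies outside [lb, ub]. */
static int violates_bounds(uint64_t lb, uint64_t ub, uint64_t addr, uint64_t end)
{
    return addr < lb || end > ub;
}
```

Plugging in the first error of the report (lb 0x7fff13b868d0, ub 0x7fff13b868d7, addr and end both 0x7fff13b868d8) gives a size of 1 and a violation, because the access starts one byte past ub.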

 

As we can see, the problem is diagnosed straight away and without manual effort: the compiler automates the checking, which can be a major boost to productivity.

 

CAVEAT: Pointer Checker detects the OOB and dangling-pointer bugs described here, but it does not detect memory leaks. Those can be flagged by the Static Analysis (SA) feature of Intel® Inspector XE and analyzed through the Inspector XE product, both included in the C++ Studio XE or Parallel Studio XE 2011 & 2013 products.

 

Some more examples of buffer overruns caused by user mistakes:

/////////test3.cpp - wrong
int *j = (int *)malloc(4);                // allocates 4 bytes, not 4 ints
j[0] = 1; j[1] = 2; j[2] = 3; j[3] = 4;   // array writes
int i = j[0] + j[1] + j[2] + j[3];        // array reads
printf("result = %d\n", i);

 

%./a.out

result = 10

Once again the small snippet appears to work and prints the right result. In a large code base the same mistake might cause unexpected behavior or intermittent crashes.

As the code shows, the allocation is only 4 bytes, not 4 integers. So j[1], j[2], and j[3] are OOB accesses.

When the program is built with -check-pointers=rw and run as shown previously, it reports 6 bounds violations: 3 for reads and 3 for writes. With -check-pointers=write it reports only 3 OOB errors, i.e. the 3 writes.
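The count of three can be reproduced with simple arithmetic. The helper below is a hypothetical illustration, not something Pointer Checker provides: it counts how many elements of a given size fall outside an allocation of a given number of bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Count the elements with index in [0, n_elems) that do not fit entirely
 * inside an allocation of alloc_bytes bytes, for elem_size-byte elements. */
static size_t count_oob_elements(size_t alloc_bytes, size_t elem_size, size_t n_elems)
{
    assert(elem_size > 0);
    size_t fitting = alloc_bytes / elem_size;   /* whole elements that fit */
    return n_elems > fitting ? n_elems - fitting : 0;
}
```

Assuming a 4-byte int, `count_oob_elements(4, 4, 4)` is 3. test3.cpp writes each of those three elements once and reads each once, which accounts for the 3 + 3 = 6 violations; with an allocation of 16 bytes the count drops to 0.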

 


///////  test3.cpp - right
int *j = (int *)malloc(4 * sizeof(int));

As we can see, the Pointer Checker-enabled build caught the problem the first time the code was run, early in development. This improves code quality and reliability, speeds up error checking and troubleshooting, and reduces maintenance in later, critical stages.

Another common mistake is using sizeof() instead of strlen() on string types.


//////////test4.cpp - Wrong version:--
another_chptr = (char *) malloc (strlen((char *)my_chptr));
memset (another_chptr, '@', sizeof(my_chptr));   // sizeof gives the size of the pointer, not the string length

//////test4.cpp - Correct version:--
another_chptr = (char *) malloc (strlen((char *)my_chptr));
memset (another_chptr, '@', strlen(my_chptr));   // fills exactly the bytes that were allocated
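The difference matters because sizeof applied to a char pointer yields the size of the pointer itself (typically 4 or 8 bytes), whatever the length of the string it points to. A minimal sketch of the pattern follows, assuming the goal is a terminated string of '@' characters as long as the source; `make_mask` is a hypothetical name, and the extra byte for the terminator is an addition that test4.cpp does not have.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated string with the same
 * length as src, filled with '@'. The caller must free() the result.
 * Returns NULL if the allocation fails. */
static char *make_mask(const char *src)
{
    assert(src != NULL);
    size_t len = strlen(src);              /* characters in the string */
    char *out = (char *)malloc(len + 1);   /* +1 for the terminating '\0' */
    if (out == NULL)
        return NULL;
    memset(out, '@', len);                 /* never more than was allocated */
    out[len] = '\0';
    return out;
}
```

Using the same `len` for both the allocation and the fill keeps the two sizes from drifting apart, which is the mistake in the wrong version above.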

Consider another example, in which a simple user mistake leads to inconsistent results between runs, as tested on Windows* OS.


//////////test5.cpp
#include <stdio.h>
#include <string.h>

int main()
{
    char a[8];
    char b[8];
    memset(a, 'a', 7);          // a[7] is never set, so a may not be terminated
    if (strlen(a) < 8)
        strcpy(b, "a<8byte");
    else
        strcpy(b, "a>8byte");
    printf("a=%s  b=%s\n", a, b);

    return 0;
}

 

C:Usersmkulka3DocumentsIntel-DocsKBstuffPoint>oob2.exe

a=aaaaaaa  b=a<8byte

 

C:Usersmkulka3DocumentsIntel-DocsKBstuffPoint>oob2.exe

a=aaaaaaa ├<#wε<#wÇ  b=a>8byte

 

C:Usersmkulka3DocumentsIntel-DocsKBstuffPoint>oob2.exe

a=aaaaaaa  b=a<8byte

 

C:Usersmkulka3DocumentsIntel-DocsKBstuffPoint>oob2.exe

a=aaaaaaa  b=a<8byte

 

C:Usersmkulka3DocumentsIntel-DocsKBstuffPoint>oob2.exe

a=aaaaaaa  b=a<8byte

 

C:Usersmkulka3DocumentsIntel-DocsKBstuffPoint>oob2.exe

a=aaaaaaa $  b=a>8byte 

As you can see, the runs produce two different results, a<8byte and a>8byte, which is unpredictable behaviour.

Here the user combined strlen with memset. By definition, memset only sets the given memory locations to 'a'. Unlike a string function, it does not append a terminating null, so "a" may or may not end with one, depending on what the memory already contained. The result is therefore inconsistent.

On a different OS the result may happen to be consistent; this also depends on differences in the implementation of memset.


////////test5.cpp  -- correct
char a[8];
char b[8];
memset(a, 'a', 7);
strcpy(a, "aaaaaaa");   // copies 'a' 7 times; the 8th char is '\0'
if (strlen(a) < 8)
    strcpy(b, "a<8byte");
else
    strcpy(b, "a>8byte");
printf("a=%s  b=%s\n", a, b);

1.   Because strlen() is used for the check in the if-else, filling "a" with strcpy, which is also a string function and writes the terminating null, makes the result consistent.

2.   Alternatively, use a test other than strlen for the condition above.
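The second option could look like the sketch below: a length test that never reads beyond the array, built on the standard `memchr`. The name `bounded_length` is hypothetical and the function is only an illustration of the idea, not code from the article.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the string stored in buf, examining no more than cap bytes.
 * Returns cap if no terminating '\0' is found inside the buffer. */
static size_t bounded_length(const char *buf, size_t cap)
{
    assert(buf != NULL);
    const char *nul = (const char *)memchr(buf, '\0', cap);
    return nul ? (size_t)(nul - buf) : cap;
}
```

A result equal to the buffer size signals that the buffer holds no terminator. For test5.cpp the reliable fix is still to write the terminator explicitly, since a[7] is otherwise uninitialized; the bounded test only guarantees that the check itself stays inside the array.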

Note: Pointer Checker will NOT detect the problem above, because there was no actual OOB violation or array overrun, even though the combination of standard APIs was incorrect. Such issues may be reported as warnings by the SSA feature of Inspector. An OOB error is reported only if strcpy writes past the end of the array, but that is a different kind of problem from this incorrect use of APIs, which Pointer Checker does not detect; this is one of its minor limitations.

Many OOB problems in a complex application are difficult to detect because they occur only at run time, for particular tests, data sets, and execution paths. Which OOB issues are found therefore depends on the data used and on the code paths exercised. Pointer Checker is essentially a dynamic, run-time analysis tool for OOB errors: it cannot report violations in code regions that are never executed, and such bugs may be discovered later, when customers run their own tests with different data.

Pointer Arithmetic and Pointer Checking

 

Pointer arithmetic by itself does not trigger the Pointer Checker. A pointer may go out of range, as long as it is not used to make an indirect reference to an out-of-range address.

For example, if you create an array with 100 elements, the following applies:

 


 

char *p = (char *)malloc(100);
p += 200;      // pointer is out of range, but no error
p[-101] = 0;   // access is still in range, i.e. it is the original p[99]
p[0] = 0;      // out-of-bounds error occurs here, because it is the original p[200]
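The rule can be pictured with a small model in which the bounds stay attached to the original allocation while the pointer moves freely. This is a conceptual sketch only, with a hypothetical function name; it does not describe how Pointer Checker tracks bounds internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Conceptual model: the allocation covers byte offsets [0, alloc_size).
 * A char pointer has been moved by ptr_offset bytes from the start, and
 * the program accesses p[index]. Only the effective offset matters. */
static int access_in_bounds(size_t alloc_size, ptrdiff_t ptr_offset, ptrdiff_t index)
{
    assert(alloc_size <= (size_t)PTRDIFF_MAX);
    ptrdiff_t effective = ptr_offset + index;
    return effective >= 0 && (size_t)effective < alloc_size;
}
```

For the example above, `access_in_bounds(100, 200, -101)` is true, since the effective offset is 99, while `access_in_bounds(100, 200, 0)` is false, since the effective offset is 200.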

 

 

We have discussed using the -check-pointers option to analyze buffer overruns and pointer accesses beyond allocated boundaries. Another useful debugging feature is the detection of dangling pointers.

In the next article on the Pointer Checker feature, we will discuss finding bugs related to dangling pointers in stack and heap memory, with a few examples, and finish with guidelines and tips for using this feature efficiently.

 

For more detailed information about compiler-based optimizations, see our Optimization Notice.