OpenCV Compilation accuracy errors


Engin F.'s picture

Hello all,

I have compiled OpenCV using the Intel C++ Compiler, with the following flags:

-O3 -xAVX -march=corei7-avx -openmp -parallel -ipp -mkl -tbb -opt-matmul -std=c++0x -g0 

The problem is that whenever I run the OpenCV unit tests, some tests fail due to accuracy problems. The unit test output for the OpenCV core module is attached to this post.

Because of these failures, I read about floating-point operations in the compiler's reference guide.

The guide says the default fp-model is fast, so I used -fp-model precise to get consistent floating-point behavior. That option does not help either.

I work on a 32-bit Ubuntu distro and set up the 32-bit compiler environment with the following command:

source /path/to/compiler/compilervars.sh -ia32

What do you suggest?

Regards.

Attachment: core.txt (37.19 KB)
Sergey Kostrov's picture

>>...What do you suggest?

You need to provide an exact example of some inaccuracy, with technical details. Please do not try to cover all cases; at the beginning, one case will be enough for investigation.

Sergey Kostrov's picture

>>...[ FAILED ] Core_DFT.accuracy

Since you're building the OpenCV binaries from sources, switch the configuration to Debug and try to understand, under the debugger, why that test fails.

>>...I have used following option to generate consistent floating point operations: -fp-model precise. This option is not working too...

More technical details, please. All fp-model options are fundamental (in the sense that they are very old), and if something were wrong it would affect other developers, teams, and projects, and the forum would be filled with reports instantly.

Engin F.'s picture

The only -fp-model option I used was -fp-model precise. All the flags I used are listed in my post.

When I inspected the failed unit tests, I realized they fail not only for floating-point arithmetic but also for integer arithmetic. For example, the unit test for OpenCV's reduce function fails for all OpenCV types, e.g. with the arguments srcType = CV_8UC1, dstType = CV_8UC1, opType = CV_REDUCE_MAX, dim = COLS. That operation should involve only integer operations, yet the unit test reports a bad-accuracy error.

I searched for a reported bug related to these unit tests but could not find one.

I have run the unit tests with roughly 15 different compiler flag sets, and found that the bad-accuracy failures appear whenever I switch from -O0 to -O2. For example, the following two flag sets are interesting:

-O0 -fp-model fast=2 (here I deliberately invited floating-point accuracy problems, yet all unit tests pass; these are all the flags I used)

-O2 -fp-model fast=2 (unit tests fail; these are all the flags I used)

(In the attached unit test result file, you can see detailed results for the test marked with ###OPENCV REDUCE FUNTION###.)

With these results at hand, I have started to think the failures are related to vectorization or other optimization procedures of the compiler. But how would integer arithmetic be affected by those?
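
To illustrate why this question is reasonable: a column-wise max reduction on 8-bit data is pure integer work, and its inner loop is exactly the kind the compiler rewrites with packed-integer instructions (e.g. pmaxub/vpmaxub) at -O2 and above. The sketch below is a plain-C++ analogue, not OpenCV's actual code, and the interpretation of dim is my assumption:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Column-wise max reduction over a rows x cols 8-bit matrix stored row-major,
// loosely analogous to cv::reduce(..., CV_REDUCE_MAX) on CV_8UC1 data.
// The result holds one maximum per column (assumed meaning of the dim flag).
std::vector<uint8_t> reduceMaxCols(const std::vector<uint8_t>& m,
                                   int rows, int cols) {
    std::vector<uint8_t> out(cols, 0);   // 0 is a safe identity for unsigned max
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)   // a natural vectorization target
            out[c] = std::max(out[c], m[r * cols + c]);
    return out;
}
```

If the vectorized and scalar versions of such a loop disagree, the difference cannot come from floating-point rounding, which is what makes the integer failures suspicious.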

While researching this issue, I noticed that some people have difficulties with CMake and icc, e.g. compilers not being detected. Here is my compilation procedure:

1. source /path/to/compiler/bin/iccvars.sh ia32

2. export CC=icc

3. export CXX=icpc

4. cmake -i /path/to/source

I believe this procedure is correct, but I am sharing it so that you can check it too.

I did detect one suspicious item: in CMake's verbose mode, I saw that the compiler was passed a flag named -fsigned-char, although I never specified it. I could not resolve this, and I mention it here since the issues may be related. I will raise it with a CMake-related community and share the results in a later reply.

I am working on a 32-bit Ubuntu machine, with gcc 4.6 and icc 13.1. The OpenCV version is 2.4.6.1.

Attachments: 

result-set.txt (37.01 KB)
Sergey Kostrov's picture

>>...When I inspected the failed unit tests, I realized they fail not only for floating-point arithmetic but also
>>for integer arithmetic. For example, the unit test for OpenCV's reduce function fails for all OpenCV types...

Try to be as specific as possible. There are tens of different Intel integer instructions, and your generic explanations do not help to pin down a possible cause of all these problems.

Engin F.'s picture

Sorry, my first post was incomplete; I have now completed it.

>>Try to be as specific as possible. There are tens of different Intel integer instructions and your generic
>>explanations do not help to pin down a possible reason of all these problems.

But I expect you will find the revised post too generic as well. Could you please tell me what specific results I should provide?

Sergey Kostrov's picture

>>...For example, the unit test for OpenCV's reduce function fails for all OpenCV types, e.g. with the arguments
>>srcType = CV_8UC1, dstType = CV_8UC1, opType = CV_REDUCE_MAX, dim = COLS. That operation should involve
>>only integer operations, yet the unit test reports a bad-accuracy error...
>>...
>>...I found that the bad-accuracy failures appear whenever I switch from -O0 to -O2...

Could you create a small reproducer, or upload a source file where you see a problem with annotations?

Sergey Kostrov's picture

>>Could you please tell me what specific results I should provide?

Let's summarize the situation:

1. Some tests fail when the optimization level is changed from -O0 to -O2
2. There are problems with floating-point calculations
3. There are problems with integer calculations

So, please select just one test case exercising the smallest OpenCV function (look at the OpenCV source code), and upload the source files for that function and test case for review.

Did you try to debug some test cases and functions?

Note: Do not forget to do two verifications, one with -O0 and one with -O2. If debugging is not possible, a simple printf-like approach can help as well. For example:

...
void SomeFunction( void )
{
    printf( "Start\n" );
    ...
    // Some block of code 1
    printf( "1\n" );
    ...
    // Some block of code 2
    printf( "2\n" );
    ...
    // Some block of code 3
    printf( "3\n" );
    ...
    // Some block of code 4
    printf( "4\n" );
    ...
    printf( "Finish\n" );
}
...

Two Outputs are possible:

[ Test 1 - Option -O0 used - No Errors ]
...
Start
1
2
3
4
Finish
...

[ Test 2 - Option -O2 used - Some Errors ]
...
Start
1
2
3
Finish
...

Overall, you need to see exactly (!) where the function fails; in the 2nd test it fails while block of code 4 is executing.

Engin F.'s picture

Kostrov, thank you so much for your detailed interest. 

I have just read your last post and understand the approach now. In order to debug OpenCV, I will create an Eclipse project. Please give me some time to prepare.

Before reading your last post, I had prepared some source files for you related to the unit tests. I chose the OpenCV function reduce; in my previous post I pointed out failures related to it. I would appreciate it if you could find some time to inspect them.

You will find three files attached to this post:
1. ts.cpp
2. test_mat.cpp
3. reduce.cpp

ts.cpp contains the BaseTest class implementation, which is the base class for all test case classes.
test_mat.cpp contains the reduce test implementation; I have simplified this source file so that it contains only the test case implementation for the reduce function.
reduce.cpp contains the original implementation of the reduce function in OpenCV. You can find detailed information about it at this page: http://docs.opencv.org/modules/core/doc/operations_on_arrays.html?highlight=reduce#cv.Reduce

I have tried to make the inspection as easy as possible by commenting the source files. You will find labels that indicate the order to follow while inspecting; start at test_mat.cpp line 317, at the label "a".

Thanks a lot.

Attachments: 

reduce.cpp.txt (10.29 KB)
test-mat.cpp.txt (11.92 KB)
ts.cpp.txt (17.99 KB)
Sergey Kostrov's picture

Lots of technical details and I will take a look. Thanks.

Engin F.'s picture

Here are additional technical details. Before moving to Eclipse, I modified test_mat.cpp to print the outputs of the operations.

The attached files contain the matrix outputs for all phases. The files follow a naming convention:

There are three types of outputs: 

The first category of output files starts with the "Generated_" prefix, the second with the "NonOptimized_" prefix, and the third with the "Optimized_" prefix. The file names then continue with the following pattern:

[COLUMN|ROW]_[SIZE_WIDTH]_[SIZE_HEIGHT]_[REDUCE_OP_SOURCE_TYPE]_[REDUCE_OP_DESTINATION_TYPE].yml

For example:

The file name Optimized_1_1_COLUMN_8UC1_8UC1.yml means: the output of the original (optimized) reduce function for a randomly generated input matrix of type 8UC1; the output is of type 8UC1 and the reduce function performs a COLUMN reduction.

All three categories of files are stored in correspondingly named folders.

I have not inspected the results yet; I found it reasonable to share them with you immediately, since these outputs may help the investigation. I will inspect them after posting this message.

Regards. 

Attachments: 

yml-output.zip (31.43 MB)
Sergey Kostrov's picture

Engin, please try to create an Eclipse project (ASAP) so you can see the problem at source level under a debugger. As promised, I will look at the earlier test data, that is, the small version (not the 33.5 MB set you uploaded recently).

Engin F.'s picture

OK. I have not been able to work on this yet; I will work on it on Monday. Thank you so much, Kostrov.

Engin F.'s picture

I have integrated the Intel compiler into Eclipse and created an OpenCV Eclipse project. I tried to debug, but there is a problem: when I put a breakpoint in the source code, the program stops at a different position and never hits my breakpoint. Hence I cannot debug the OpenCV library.

I have tried to find a solution to this problem with no success. 

I will inform you whenever I solve the problem.

Regards.

Sergey Kostrov's picture

>>...If debugging is not possible a simple 'printf'-like approach could help as well...

Thanks for the update. Try both debugging and printf-like approaches. Outputting values with printf (assuming you place the calls in the right pieces of code) can help significantly and save you time (there is no need to figure out what is wrong with the debugger). A compromise between these two techniques is the best way to proceed.

Engin F.'s picture

OK, I will proceed that way and inform you as results are obtained.

Regards.

Engin F.'s picture

Hello,

I have debugged the reduce test and seen where the problems are. Alternative implementations of the problematic areas solve the accuracy failure of the reduce test. Let me explain what I found by debugging with the printf approach.

First of all, the optimized implementation of the reduce function, in other words the original implementation, works accurately. The test case fails because there are problems in the non-optimized reference implementation of reduce and in the code that compares the optimized and non-optimized matrices. I will explain the problematic areas below; note that the problems occur in primitive OpenCV operations, as you will see from the solutions.

Three workarounds solve the accuracy failure of the reduce test.

1. Matrix initialization:
In the non-optimized implementation, I found that the matrices for min, max, and sum are initialized incorrectly. The original code initializes them as follows:

cv::Mat sum;                    // declare matrix sum
sum.create( 1, 100, CV_64F );   // allocate a 1x100 matrix of 64-bit double-precision elements
sum.setTo( 0 );                 // set all elements to 0.0

I found that some elements are not set to 0; some hold very large values and some very small ones.

As a workaround, I implemented an alternative:

cv::Mat sum;                              // declare matrix sum
sum = cv::Mat::zeros( 1, 100, CV_64F );   // allocate and zero-fill in one step

This way, the sum matrix is created correctly.

Moreover, creating a matrix and filling all its elements with DBL_MAX or -DBL_MAX is also problematic:
cv::Mat min, max;
min.create(1,100, CV_64F);
max.create(1,100, CV_64F);
min.setTo(DBL_MAX);
max.setTo(-DBL_MAX);

This usage generates wrong matrices, so I changed the implementation as follows:

cv::Mat min,max;
min = cv::Mat(1,100,CV_64F, DBL_MAX);
max = cv::Mat(1,100,CV_64F, -DBL_MAX);

With this implementation, the problem is solved.
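
The symptom described above (a freshly "initialized" matrix containing huge or tiny garbage values) is what reading allocated-but-never-written storage looks like. As a point of reference, here is a plain-C++ analogue of the two initialization patterns, with no OpenCV involved; both must produce identical, fully filled buffers:

```cpp
#include <algorithm>
#include <cfloat>
#include <cstddef>
#include <vector>

// Pattern 1: allocate first, then fill -- analogous to create() + setTo().
std::vector<double> makeThenFill(std::size_t n, double v) {
    std::vector<double> m(n);           // allocate (std::vector zero-initializes;
    std::fill(m.begin(), m.end(), v);   // cv::Mat::create() does not)
    return m;
}

// Pattern 2: allocate and fill in one step -- analogous to
// cv::Mat::zeros() or the cv::Mat(rows, cols, type, value) constructor.
std::vector<double> makeInitialized(std::size_t n, double v) {
    return std::vector<double>(n, v);
}
```

In the failing OpenCV build only the one-step pattern behaved correctly, which is why the workaround helps even though the two patterns are semantically equivalent.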

2. Matrix comparison

As I stated before, the results of the two different reduce implementations (the optimized and non-optimized ones) are compared to check whether the optimized one produces correct results. I found that this comparison code block does not work correctly.

Original implementation was as follows:

Assume that opRes and dst hold the results of the optimized and non-optimized reduce implementations, respectively, and that diff is the difference matrix between opRes and dst. All of these are of type cv::Mat.

absdiff( opRes, dst, diff );
bool check = false;
if( dstType == CV_32F || dstType == CV_64F )
    check = countNonZero( diff > eps*dst ) > 0;
else
    check = countNonZero( diff > eps ) > 0;

In this implementation, I found that diff > eps*dst and diff > eps generate correct results, i.e. all elements are 0, but countNonZero produces wrong output: it reports that some elements are nonzero when they are not. Hence the test fails. A primitive OpenCV function, countNonZero(), generates incorrect results.

The workaround is as follows:
Once the diff matrix is obtained, I iterate over it in a for loop and check whether each element exceeds the specified threshold. This way, the comparison phase of the test produces correct results.

3. Average computation

The reduce function can calculate the average of rows or columns, and I found that something goes wrong in the average calculation.

Original version of non-optimized reduce function is as follows:

sum.convertTo( avg, CV_64FC1 ); // convert all elements of sum into avg as CV_64FC1
avg = avg * (1.0 / (dim == 0 ? (double)src.rows : (double)src.cols)); // divide by row or column count depending on the reduction dimension

But the outputs of this operation are wrong.

The workaround for this problem is as follows:

In a for loop, I iterate over all elements of sum; in each iteration, the element divided by the denominator is written to avg. This way, the average is calculated correctly.

Here, the primitive OpenCV operator*() generates wrong outputs.
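
The workaround amounts to replacing the Mat-level scaling with an explicit element-wise division. A plain-C++ sketch (denom would be src.rows or src.cols depending on the reduction dimension):

```cpp
#include <cstddef>
#include <vector>

// Element-wise average: avg[i] = sum[i] / denom, replacing the
// avg = avg * (1.0 / denom) expression at the cv::Mat level.
std::vector<double> averageOf(const std::vector<double>& sum, double denom) {
    std::vector<double> avg(sum.size());
    for (std::size_t i = 0; i < sum.size(); ++i)
        avg[i] = sum[i] / denom;   // divide each accumulated sum directly
    return avg;
}
```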

All three of these workarounds solve the problem, and the accuracy tests now pass. But the root cause must lie elsewhere: I do not believe these primitive OpenCV functions are themselves buggy, and I cannot understand where the problem is.

In the attachment to this post you can find detailed results and detailed explanations of the workarounds.

Regards.

Attachments: 

test-results.tar.gz (26.75 KB)
Sergey Kostrov's picture

After reading your latest post, I've concluded that all the problems are specific to OpenCV and are not related to the Intel C++ compiler. Is that correct?

Engin F.'s picture

Yes, there are problems with OpenCV functions, but I don't think the OpenCV functions are buggy; they should be bug-free because they are primitive ones. I think there is a problem in my compilation phase. I am suspicious of the CMake configuration phase, so I wrote to the CMake mailing list, but I could not get any help there. You can find the thread here: http://www.cmake.org/pipermail/cmake/2013-August/055425.html

Regards.

Sergey Kostrov's picture

Thanks for the update.

>>...I have used the following flags for compilation:
>>
>>-O3 -xAVX -march=corei7-avx -openmp -parallel -ipp -mkl -tbb -opt-matmul -std=c++0x -g0

Try additional verifications to see whether the outputs are consistent with the -O2, -O1, and -Od optimization options.

Note: If you need results that are as accurate as possible, then the precise floating-point model needs to be used. For example, in a recently completed test of a complex image-processing subsystem, a change from precise to fast resulted in a loss of accuracy: instead of 1.000000, values of 1.000001 and 0.999999 were calculated in some cases. Even a loss of only 0.000001 can result in a significantly higher accumulated error.
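
The accumulation effect is easy to demonstrate in isolation: repeatedly adding a value that is not exactly representable in binary, such as 0.1, drifts in single precision, while a compensated (Kahan) sum stays far closer to the exact result. This generic sketch is not tied to the subsystem mentioned above:

```cpp
#include <cmath>

// Naive single-precision accumulation of n copies of v; each addition
// rounds, and the rounding errors accumulate.
float naiveSum(float v, int n) {
    float s = 0.0f;
    for (int i = 0; i < n; ++i)
        s += v;
    return s;
}

// Kahan compensated summation: a second variable tracks the low-order
// bits lost by each addition and feeds them back in.
float kahanSum(float v, int n) {
    float s = 0.0f, c = 0.0f;
    for (int i = 0; i < n; ++i) {
        float y = v - c;
        float t = s + y;
        c = (t - s) - y;   // the rounding error of s + y
        s = t;
    }
    return s;
}
```

Note that aggressive fp-model settings may reassociate the Kahan steps away, which is one more reason consistent results call for -fp-model precise.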

Engin F.'s picture

Hello,

I have tried the compiler flags -O3 -xAVX -march=corei7-avx -openmp -parallel -ipp -mkl -tbb -opt-matmul -std=c++0x -g0, and there is no problem with the reduce function.

However, if you inspect the test results you will see that the following tests are still failing:

Core_AddS/ElemWiseTest.accuracy
Core_SubRS/ElemWiseTest.accuracy
Core_AbsDiffS/ElemWiseTest.accuracy
Core_AndS/ElemWiseTest.accuracy
Core_OrS/ElemWiseTest.accuracy
Core_XorS/ElemWiseTest.accuracy
Core_MaxS/ElemWiseTest.accuracy
Core_MinS/ElemWiseTest.accuracy
Core_CmpS/ElemWiseTest.accuracy
Core_InRangeS/ElemWiseTest.accuracy
Core_CountNonZero/ElemWiseTest.accuracy

These are primitive operations; note that the CountNonZero test also fails. If you remember, one workaround for the reduce accuracy test was implementing an alternative to countNonZero. I will investigate why the tests above are failing.
