Using TBB Task Scheduler to calculate determinant of matrix

Using TBB Task Scheduler to calculate determinant of matrix

Having a problem with understanding why task scheduler makes program slower than its serial version is...

Here is serial version of my program which should calculate determinant of given matrix:

double serial(vector<vector<double>> matrix)
{
	double matrix_ord = matrix.size();
	double det = 0;

	if (matrix_ord == 1)
		det = matrix[0][0];
	else
	{
		for (int i = 0; i < matrix_ord; i++)
		{
			vector<vector<double>> minor = get_minor(matrix, i);

			bool sign = (i % 2 == 0) ? 1 : -1;
			det += matrix[0][i] * serial(minor) * sign;
		};
	}

	return det;
}

Because it is reccursive function, using TBB tasks i made Recursive chain reaction in parallel version, just like Fibonacci numbers example in TBB documentation:

First i am creating root task:

double parallel(vector<vector<double>> matrix)
{

	double det = 0;

	MyTask& my_task= *new(task::allocate_root()) MyTask(&det, matrix);
	task::spawn_root_and_wait(my_task);

	return det;
}

MyTask class with execute() function looks like this:

class MyTask: public task {
public:
	vector<vector<double>> matrix;
	double* det;

	MyTask(double* det_, vector<vector<double>> matrix_) : 
		det(det_), matrix(matrix_)  {}

	tbb::task* execute() 
	{
		int matrix_ord = matrica.size();

		if (matrix_ord == 1)
			*det= matrix[0][0]; 
		else
		{
			vector<double> determinants_of_minors(matrix_ord, 0);
			
			set_ref_count(matrix_ord+ 1);

			MyTask& first_task= *new (allocate_child()) MyTask(&determinants_of_minors.at(0), get_minor(matrix,0));
			
			for (int i = 1; i < matrix_ord; i++)
			{
				vector<vector<double>> minor = get_minor(matrix, i);

				MyTask& task_ = *new (allocate_child()) MyTask(&determinants_of_minors.at(i), minor);
				spawn(task_);
			}

			spawn_and_wait_for_all( first_task );

			for (int i = 0; i < matrix_ord; i++)
			{
				bool sign = (i % 2 == 0) ? 1 : -1;
				*det += matrix[0][i] * determinants_of_minors.at(i) * sign;
			}
		}

		return NULL;
	}
};

As you can see, the idea is to create subtasks for every minor of current matrix till matrix size is 1x1 -> then the task is returning the only element of matrix to its successor.

Where the problem could be?

2 posts / 0 new
Last post
For more complete information about compiler optimizations, see our Optimization Notice.

Hi Djordje,

Fibonacci is an illustrative example. However, the explanation of it contains an essential part - "CutOff" parameter. I am sure that because of the program you provided does has anything to do with this parameter, it performs slower then the serial counterpart. The link to the right place in the docs: https://www.threadingbuildingblocks.org/docs/help/hh_goto.htm?index.htm#tbb_userguide/Simple_Example_Fibonacci_Numbers.html

Regards, Aleksei

Leave a Comment

Please sign in to add a comment. Not a member? Join today