Using the New Task API Feature of Intel® VTune™ Amplifier XE 2013

With the new VTune™ Amplifier XE 2013, we have introduced the Task API feature, which helps in labeling and visualizing user-defined tasks.

A task is a logical unit of work performed by one or more threads. We can use the Task APIs to assign tasks to threads.

These are some of the Task APIs used to visualize user tasks.

| API | Functionality |
| --- | --- |
| void ITTAPI __itt_task_begin(const __itt_domain *domain, __itt_id taskid, __itt_id parentid, __itt_string_handle *name) | Create a task instance on a thread. This becomes the current task instance for that thread. A call to __itt_task_end() on the same thread ends the current task instance. |
| void ITTAPI __itt_task_begin_fn(const __itt_domain *domain, __itt_id taskid, __itt_id parentid, void *fn) | Begin a task instance on a thread. |
| void ITTAPI __itt_task_end(const __itt_domain *domain) | End a task instance on a thread. |

| Parameter | Description |
| --- | --- |
| __itt_domain | The domain of the task. |
| __itt_id taskid | This is a reserved parameter. |
| __itt_id parentid | This is a reserved parameter. |
| __itt_string_handle | The task string handle. |
| *fn | This is a reserved parameter. |
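To make the "current task instance" wording concrete, here is a toy model of the begin/end pairing on a single thread. It is not the ittnotify implementation: the names `model_task_begin`, `model_task_end`, and `model_current_task` are hypothetical stand-ins, and the sketch assumes that tasks opened on one thread nest like a stack, so that each end call closes the most recently begun task.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: NOT how ittnotify is implemented.
// Each thread keeps its own stack of open task names (assumption: tasks nest).
thread_local std::vector<std::string> open_tasks;

// Stand-in for __itt_task_begin: the new task becomes the current task.
void model_task_begin(const std::string& name) {
    open_tasks.push_back(name);
}

// Stand-in for __itt_task_end: closes the current (most recently begun) task.
// Returns the name of the task that was closed, or "" if none was open.
std::string model_task_end() {
    if (open_tasks.empty()) return "";
    std::string closed = open_tasks.back();
    open_tasks.pop_back();
    return closed;
}

// Name of the current task on this thread, or "" when no task is open.
std::string model_current_task() {
    return open_tasks.empty() ? "" : open_tasks.back();
}
```

In this model, beginning "User Task" and then "UserSubTask" on the same thread makes "UserSubTask" the current task until the first end call, after which "User Task" is current again.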

How to use the Task APIs?

Follow these five steps to enable and take advantage of the Task APIs:

1. Include the "ittnotify.h" header.
2. Create the __itt_* handles.
3. Insert task begin and end marks in your code.
4. Link to libittnotify.lib.
5. Enable "Analyze user tasks" before profiling.
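For step 3, one way to keep every begin mark matched with an end mark is a small scope guard. The sketch below is an illustration only: `ScopedTask` and `run_demo` are hypothetical names that are not part of ittnotify.h, and the begin/end actions are passed in as callables so the example compiles without the VTune headers.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of ittnotify.h.
// Runs the begin action on construction and the end action on destruction,
// so the end mark is emitted even on an early return or an exception.
class ScopedTask {
public:
    ScopedTask(const std::function<void()>& begin, std::function<void()> end)
        : end_(std::move(end)) {
        begin();
    }
    ~ScopedTask() { end_(); }

    // Copying would run the end action twice, so it is disabled.
    ScopedTask(const ScopedTask&) = delete;
    ScopedTask& operator=(const ScopedTask&) = delete;

private:
    std::function<void()> end_;
};

// Demo with logging stand-ins in place of the real task marks.
std::vector<std::string> run_demo() {
    std::vector<std::string> log;
    {
        ScopedTask task([&log] { log.push_back("begin"); },
                        [&log] { log.push_back("end"); });
        log.push_back("work");  // the profiled region would go here
    }  // the end action runs here, when the guard leaves scope
    return log;
}
```

In a real build, the two callables would wrap `__itt_task_begin(domain, __itt_null, __itt_null, handle)` and `__itt_task_end(domain)`, with the domain and string handle created as in step 2.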

#include <windows.h>
#include <conio.h>
#include <iostream>
#include "ittnotify.h"

using namespace std;

#define NUM 1024
#define NUM_THREADS 4

// The domain and string handles are created once and reused by every task.
// The wide-string literals assume a Unicode build.
__itt_domain* domain = __itt_domain_create(L"Task Domain");
__itt_string_handle* UserTask = __itt_string_handle_create(L"User Task");
__itt_string_handle* UserSubTask = __itt_string_handle_create(L"UserSubTask");

// Sample workload: multiply two 200x200 matrices filled with ones.
// The seconds parameter is not used.
void do_foo(double seconds)
{
    int a[200][200], b[200][200], c[200][200], i, j, k, sum = 0;

    // Fill both input matrices with ones.
    for (i = 0; i < 200; i++)
    {
        for (j = 0; j < 200; j++)
        {
            a[i][j] = 1;
            b[i][j] = 1;
        }
    }

    // c = a * b
    for (i = 0; i < 200; i++)
    {
        for (j = 0; j < 200; j++)
        {
            sum = 0;
            for (k = 0; k < 200; k++)
            {
                sum += a[i][k] * b[k][j];
            }
            c[i][j] = sum;
        }
    }

    cout << "The last variable is " << c[199][199] << endl;
}

// Worker thread: runs the workload inside the "UserSubTask" task.
DWORD WINAPI work(void *pArg)
{
    __itt_task_begin(domain, __itt_null, __itt_null, UserSubTask);
    do_foo(1);
    __itt_task_end(domain);
    return 0;
}

int main()
{
    int i = 0;
    HANDLE hThread[NUM_THREADS];

    // The "User Task" task covers the main thread's work and the wait below.
    __itt_task_begin(domain, __itt_null, __itt_null, UserTask);
    do_foo(1);

    for (i = 0; i < NUM_THREADS; i++)
    {
        hThread[i] = CreateThread(NULL, 0, work, (void*)(INT_PTR)i, 0, NULL);
    }

    // Wait for all worker threads to finish.
    WaitForMultipleObjects(NUM_THREADS, hThread, TRUE, INFINITE);
    __itt_task_end(domain);

    for (i = 0; i < NUM_THREADS; i++)
    {
        CloseHandle(hThread[i]);
    }

    _getch();
    return 0;
}

As the code shows, we have placed the matrix-arithmetic user code inside a do_foo() function and bracketed each call to that function with the task begin and end APIs.
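As a quick sanity check on the workload itself: multiplying two n x n matrices filled with ones gives n in every entry, so the sample should print 200 for c[199][199]. The sketch below reproduces that arithmetic with heap-allocated vectors; `last_entry_of_ones_product` is a hypothetical helper name, and n is assumed to be at least 1.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: multiplies two n x n all-ones matrices and returns
// the bottom-right entry of the product. Assumes n >= 1.
int last_entry_of_ones_product(int n) {
    std::vector<std::vector<int>> a(n, std::vector<int>(n, 1));
    std::vector<std::vector<int>> b(n, std::vector<int>(n, 1));
    std::vector<std::vector<int>> c(n, std::vector<int>(n, 0));

    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            int sum = 0;
            for (int k = 0; k < n; k++) {
                sum += a[i][k] * b[k][j];  // each term is 1 * 1
            }
            c[i][j] = sum;                 // n terms of 1 add up to n
        }
    }
    return c[n - 1][n - 1];
}
```

Using vectors here also keeps the three matrices off the thread stack, which is a design choice of this sketch rather than something the original sample does.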

How to visualize the user tasks in the VTune™ Amplifier Tasks pane?

Once we have wrapped the user code in the Task APIs, we can build the project and start profiling with a user-mode sampling collection, e.g. the Hotspots analysis of VTune™ Amplifier XE. The results then show the task names that we passed to the __itt_task_begin API (UserSubTask and User Task in the code snippet above). When we select the "Tasks" tab in the Analysis pane, we should also see the CPU time of each thread for each defined task.

As the figure above shows, the name of each subtask is listed and highlighted below the CPU usage graph. In the same way, any software algorithm analysis can take advantage of the Task APIs to visualize user tasks.

Regards,

Sukruth H V
