Example of arbb_vmapi?

Hello,
Is there an example anywhere of how to build and execute functions directly using the arbb_vmapi interface? Even a simple function to sum a vector would be a useful starting point if I could see how to execute it (building it seems easy enough using the functions I've found in arbb_vmapi).
Thanks,
Bill

Take a look at the following resources for a primer on the VM API:

UIUC UPCRC Research Seminar video recording - The Intel Array Building Blocks Virtual Machine

http://media.cs.illinois.edu/DCS/research/upcrc/UPCRC-2010-11-04.asx

Here are the corresponding slides:

The+Intel%C2%AE+Array+Building+Blocks+Virtual+Machine.pdf

Here is an example using the VM API.

VM+example.pdf

Here is an example adding two scalars and printing the result. First, get the default context and prepare defining a function:
arbb_context_t context;
arbb_get_default_context(&context, detail::throw_on_error_details());
arbb_type_t type;
arbb_get_scalar_type(context, &type, arbb_f32, detail::throw_on_error_details());
arbb_type_t inputs[] = { type, type };
arbb_type_t outputs[] = { type };
arbb_type_t fn_type;
arbb_get_function_type(context, &fn_type,
sizeof(outputs) / sizeof(*outputs), outputs,
sizeof(inputs) / sizeof(*inputs), inputs,
detail::throw_on_error_details());
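
As a side note on the element counts passed above: the sizeof(array) / sizeof(*array) idiom silently gives a wrong answer if the array has decayed to a pointer. A minimal sketch of a stricter alternative follows; array_length is a hypothetical helper for illustration, not part of the VM API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: deduces the element count of a built-in array.
// Unlike sizeof(a) / sizeof(*a), it fails to compile when given a pointer.
template <typename T, std::size_t N>
constexpr std::size_t array_length(T (&)[N]) {
    return N;
}

// Example mirroring the shape of the inputs/outputs arrays above.
std::size_t count_inputs() {
    int inputs[] = { 1, 2 };
    // Both ways of counting agree for a genuine array.
    assert(sizeof(inputs) / sizeof(*inputs) == array_length(inputs));
    return array_length(inputs);
}
```

Either form works for the arrays in this example; the template version only adds a compile-time safety net.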

The next step is to define the function. Note that the indentation and the scope {} only make the code easier to read and keep some variables local to the block:
arbb_function_t function;
{
arbb_begin_function(context, &function, fn_type, "add", 0, detail::throw_on_error_details());
arbb_variable_t a, b, c;
enum { is_input, is_output };
arbb_get_parameter(function, &a, is_input, 0, detail::throw_on_error_details());
arbb_get_parameter(function, &b, is_input, 1, detail::throw_on_error_details());
arbb_get_parameter(function, &c, is_output, 0, detail::throw_on_error_details());

arbb_variable_t in[] = { a, b };
arbb_variable_t out[] = { c };
arbb_op(function, arbb_op_add, out, in, 0, detail::throw_on_error_details());
arbb_end_function(function, detail::throw_on_error_details());
}

Now, one can compile the defined function:
arbb_compile(function, detail::throw_on_error_details());

To execute the function we prepare some example input:
float data[] = { 25.0f, 7.0f };
arbb_binding_t null_binding;
arbb_set_binding_null(&null_binding);

arbb_global_variable_t ga, gb, gc;
arbb_create_constant(context, &ga, type, data + 0, 0, detail::throw_on_error_details());
arbb_create_constant(context, &gb, type, data + 1, 0, detail::throw_on_error_details());

arbb_variable_t in[2];
arbb_get_variable_from_global(context, in + 0, ga, detail::throw_on_error_details());
arbb_get_variable_from_global(context, in + 1, gb, detail::throw_on_error_details());

arbb_variable_t out[1];
arbb_create_global(context, &gc, type, "result", null_binding, 0, detail::throw_on_error_details());
arbb_get_variable_from_global(context, out + 0, gc, detail::throw_on_error_details());

To finally execute the function...
arbb_execute(function, out, in, detail::throw_on_error_details());

... and to print the result
float result = 0;
arbb_read_scalar(context, out[0], &result, detail::throw_on_error_details());
std::cout << "result: " << result << '\n';

Note also that "detail::throw_on_error_details()" is not part of the VM API. It is just one way to receive error details in the form of an exception (exceptions being a C++ feature, not something the API requires). Feel free to pass "0" (or NULL) instead, or to implement error handling in a more C-style manner. This reflects a design principle of the VM API: it does not assume a specific language mechanism.
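
The pattern can be sketched without the ArBB headers. The following is a minimal sketch under stated assumptions, not the actual ArBB implementation: error_details_t, c_style_divide, throw_on_error and divide_or_throw are hypothetical names invented for illustration. It shows a C-style function that reports failures through an optional out-pointer, and a small C++ helper that turns a reported failure into an exception.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for an error-details record (not the ArBB type).
struct error_details_t {
    const char* message;
};

// Hypothetical C-style function: returns 0 on success, nonzero on failure.
// The caller may pass a null 'details' pointer to opt out of detailed errors.
int c_style_divide(int numerator, int denominator, int* result,
                   error_details_t* details) {
    assert(result != nullptr);
    if (denominator == 0) {
        if (details != nullptr) {
            details->message = "division by zero";
        }
        return 1;
    }
    *result = numerator / denominator;
    return 0;
}

// Hypothetical helper: a temporary that converts to the out-pointer and
// throws from its destructor if the callee filled in an error message.
class throw_on_error {
public:
    throw_on_error() : details_{nullptr} {}
    operator error_details_t*() { return &details_; }
    ~throw_on_error() noexcept(false) {
        if (details_.message != nullptr) {
            throw std::runtime_error(details_.message);
        }
    }
private:
    error_details_t details_;
};

// C++-style call site: failures surface as exceptions.
int divide_or_throw(int numerator, int denominator) {
    int result = 0;
    c_style_divide(numerator, denominator, &result, throw_on_error());
    return result;
}
```

A C caller would pass a null pointer (or a pointer to its own record) and check the return code instead. Throwing from a destructor is generally discouraged and is used here only to keep the call site to one line.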

Many thanks to Noah and Hans for the pointers. I was working through the materials Noah pointed to but struggling to invoke the example dot product function. Thank you, Hans, for the worked example!
Bill

Great, let us know about your experience. My materials were more of an intro about *how* it works; Hans gave an excellent step-by-step *how to*.

Would you repeat this kind of example with dense types? I'm infuriatingly close but missing some detail(s).
Thank you!

Here is an example of performing an element-wise addition of two dense containers. For error handling in this example, define one of the following macros:
#define ERROR_DETAILS arbb::detail::throw_on_error_details() // will throw exceptions
#define ERROR_DETAILS 0 // no detailed error handling

First, get the default context and prepare defining a function:
arbb_context_t context;
arbb_get_default_context(&context, ERROR_DETAILS);

arbb_type_t scalar_type;
arbb_get_scalar_type(context, &scalar_type, arbb_f32, ERROR_DETAILS);
arbb_type_t dense_type;
unsigned int ndim = 1; // dimensionality of the dense containers
arbb_get_dense_type(context, &dense_type, scalar_type, ndim, ERROR_DETAILS);

arbb_type_t inputs[] = { dense_type, dense_type };
arbb_type_t outputs[] = { dense_type };

arbb_type_t fn_type;
arbb_get_function_type(context, &fn_type,
sizeof(outputs) / sizeof(*outputs), outputs,
sizeof(inputs) / sizeof(*inputs), inputs,
ERROR_DETAILS);

The next step is to define the function. Note that the indentation and the scope {} only make the code easier to read and keep some variables local to the block:

arbb_function_t function;
{
arbb_begin_function(context, &function, fn_type, "add", 0, ERROR_DETAILS);
arbb_variable_t parameter[3];
enum { is_input, is_output };
arbb_get_parameter(function, parameter + 0, is_input, 0, ERROR_DETAILS);
arbb_get_parameter(function, parameter + 1, is_input, 1, ERROR_DETAILS);
arbb_get_parameter(function, parameter + 2, is_output, 0, ERROR_DETAILS);

arbb_variable_t in[] = { parameter[0], parameter[1] };
arbb_variable_t out[] = { parameter[2] };
// Note: ArBB Beta 3 and earlier versions do not have the arbb_attribute_map_t* argument, i.e.
// arbb_op(function, arbb_op_add, out, in, 0, ERROR_DETAILS);
arbb_op(function, arbb_op_add, out, in, 0, 0, ERROR_DETAILS);
arbb_end_function(function, ERROR_DETAILS);
}

Now, one can compile the function which is defined above:
arbb_compile(function, ERROR_DETAILS);

To execute the function we prepare some example input. In this variant (OPTION A) we bind buffers which are already allocated to empty dense containers:
std::size_t size = 1024;
uint64_t pitches[] = { sizeof(float) };
arbb_binding_t binding[3];

// the buffers already exist and will be bound to empty dense containers
std::vector<float> data_a(size), data_b(size), data_c(size);

uint64_t sizes[] = { size };
arbb_create_dense_binding(context, binding + 0, &data_a[0], ndim, sizes, pitches, ERROR_DETAILS);
arbb_create_dense_binding(context, binding + 1, &data_b[0], ndim, sizes, pitches, ERROR_DETAILS);
arbb_create_dense_binding(context, binding + 2, &data_c[0], ndim, sizes, pitches, ERROR_DETAILS);

arbb_global_variable_t global[3];
arbb_create_global(context, global + 0, dense_type, 0, binding[0], 0, ERROR_DETAILS);
arbb_create_global(context, global + 1, dense_type, 0, binding[1], 0, ERROR_DETAILS);
arbb_create_global(context, global + 2, dense_type, "result", binding[2], 0, ERROR_DETAILS);

arbb_variable_t in[2] = {};
arbb_get_variable_from_global(context, in + 0, global[0], ERROR_DETAILS);
arbb_get_variable_from_global(context, in + 1, global[1], ERROR_DETAILS);

arbb_variable_t out[1] = {};
arbb_get_variable_from_global(context, out + 0, global[2], ERROR_DETAILS);

For the purposes of this example, let's make sure the input buffers are initialized (with values we can quickly verify):
for (std::size_t i = 0; i != size; ++i) {
data_a[i] = static_cast<float>(i);
data_b[i] = static_cast<float>(size - i);
}

Now let's execute the function...
arbb_execute(function, out, in, ERROR_DETAILS);

... and print the result (can be quite a lot of console output):
// pointers are used here instead of data_c.begin() / data_c.end() in preparation for "OPTION B"
std::copy(&data_c[0], &data_c[0] + size, std::ostream_iterator<float>(std::cout, " "));
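
Since the inputs were initialized with data_a[i] = i and data_b[i] = size - i, every element of the result should equal size (1024 here), which is easier to check programmatically than by reading the console output. A minimal host-side sketch in plain C++, without the ArBB runtime; reference_add, all_equal_to and check_example_inputs are hypothetical helper names:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side reference for element-wise addition (hypothetical helper,
// standing in for what the compiled "add" function is expected to compute).
std::vector<float> reference_add(const std::vector<float>& a,
                                 const std::vector<float>& b) {
    assert(a.size() == b.size());
    std::vector<float> c(a.size());
    for (std::size_t i = 0; i != a.size(); ++i) {
        c[i] = a[i] + b[i];
    }
    return c;
}

// Returns true if every element equals the expected value.
bool all_equal_to(const std::vector<float>& values, float expected) {
    for (std::size_t i = 0; i != values.size(); ++i) {
        if (values[i] != expected) return false;
    }
    return true;
}

// Builds the same inputs as the example and checks the invariant.
bool check_example_inputs(std::size_t size) {
    std::vector<float> a(size), b(size);
    for (std::size_t i = 0; i != size; ++i) {
        a[i] = static_cast<float>(i);
        b[i] = static_cast<float>(size - i);
    }
    return all_equal_to(reference_add(a, b), static_cast<float>(size));
}
```

In the real program, the same all_equal_to check could be applied to the contents of data_c after the function has executed. The exact float comparison is safe here because all values involved are small integers.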

... and we do some cleanup:
for (std::size_t i = 0, end = sizeof(binding) / sizeof(*binding); i != end; ++i) {
arbb_free_binding(context, binding[i], ERROR_DETAILS);
}
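
Manual cleanup like this is easy to skip on an early return or when an exception is thrown. One possible approach, a generic sketch rather than anything the VM API provides, is a small scope guard that runs a cleanup callable when it goes out of scope. scope_guard and release_three_resources are hypothetical names; in the example above, the callable would wrap the loop over arbb_free_binding.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: runs the stored callable when the guard is destroyed.
template <typename Cleanup>
class scope_guard {
public:
    explicit scope_guard(Cleanup cleanup) : cleanup_(std::move(cleanup)) {}
    ~scope_guard() { cleanup_(); }
    scope_guard(const scope_guard&) = delete;
    scope_guard& operator=(const scope_guard&) = delete;
private:
    Cleanup cleanup_;
};

// Demonstration with a plain counter standing in for real resources.
int release_three_resources() {
    int released = 0;
    {
        auto cleanup = [&released] {
            for (int i = 0; i != 3; ++i) ++released;  // e.g. free resource i
        };
        scope_guard<decltype(cleanup)> guard(cleanup);
        // ... work that may return early or throw ...
        assert(released == 0);  // cleanup has not run yet
    }
    return released;  // the guard has run by now
}
```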

Note that there are 'arbb_*_to_refcountable' functions which simplify sharing resources.

As an alternative ("OPTION B"), one can replace the block marked "OPTION A" (shown in the answer above) with the following code, which uses another way to get data into dense containers:
std::size_t size = 1024;
uint64_t pitches[] = { sizeof(float) };
arbb_binding_t binding[3];

for (std::size_t i = 0, end = sizeof(binding) / sizeof(*binding); i != end; ++i) {
arbb_set_binding_null(&binding[i]);
}

arbb_global_variable_t global[3];
arbb_create_global(context, global + 0, dense_type, 0, binding[0], 0, ERROR_DETAILS);
arbb_create_global(context, global + 1, dense_type, 0, binding[1], 0, ERROR_DETAILS);
arbb_create_global(context, global + 2, dense_type, "result", binding[2], 0, ERROR_DETAILS);

arbb_variable_t in[2] = {};
arbb_get_variable_from_global(context, in + 0, global[0], ERROR_DETAILS);
arbb_get_variable_from_global(context, in + 1, global[1], ERROR_DETAILS);

arbb_variable_t out[1] = {};
arbb_get_variable_from_global(context, out + 0, global[2], ERROR_DETAILS);

arbb_function_t global_scope;
arbb_set_function_null(&global_scope);

arbb_type_t size_type;
arbb_get_scalar_type(context, &size_type, arbb_usize, ERROR_DETAILS);
arbb_global_variable_t constant;
arbb_create_constant(context, &constant, size_type, &size, 0, ERROR_DETAILS);

std::vector<arbb_variable_t> dims(ndim);
arbb_get_variable_from_global(context, &dims[0], constant, ERROR_DETAILS);

arbb_op_dynamic(global_scope, arbb_op_alloc, 1, in + 0, ndim, &dims[0], 0, 0, ERROR_DETAILS);
arbb_op_dynamic(global_scope, arbb_op_alloc, 1, in + 1, ndim, &dims[0], 0, 0, ERROR_DETAILS);
arbb_op_dynamic(global_scope, arbb_op_alloc, 1, out + 0, ndim, &dims[0], 0, 0, ERROR_DETAILS);

void* mapping[3] = {};
arbb_map_to_host(context, in[0], &mapping[0], pitches, arbb_write_only_range, ERROR_DETAILS);
arbb_map_to_host(context, in[1], &mapping[1], pitches, arbb_write_only_range, ERROR_DETAILS);
arbb_map_to_host(context, out[0], &mapping[2], pitches, arbb_read_only_range, ERROR_DETAILS);

float* data_a = static_cast<float*>(mapping[0]);
float* data_b = static_cast<float*>(mapping[1]);
float* data_c = static_cast<float*>(mapping[2]);

This "OPTION B" allocates the dense containers within the VM and then maps the memory regions back to the host (read-only, write-only, or read-write). This is a reminder that the VM API provides an abstraction for remote execution. Remote execution enables ArBB, for example, to use accelerators such as the Intel MIC architecture for highly parallel workloads (see FAQ). Allocating the containers directly in the VM frees one from having to think about memory alignment (alignment.hpp). Note that both the binding interface ("OPTION A") and the memory mapping interface ("OPTION B") allow specifying the pitch, or data stride.
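
To make the pitch concrete: as pitches[] = { sizeof(float) } in the examples suggests, the pitch is the distance in bytes between consecutive elements, so sizeof(float) describes a densely packed array and a larger pitch describes floats embedded in a larger record. The following is a minimal sketch in plain C++ of how a byte pitch addresses elements; element_at and particle are hypothetical names, not part of the VM API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical record: a single field can be viewed as a strided float array.
struct particle {
    float x;
    float y;
    float z;
};

// Reads element 'index' from a buffer whose consecutive elements are
// 'pitch_in_bytes' apart. memcpy avoids alignment and aliasing pitfalls.
float element_at(const void* base, std::size_t index, std::size_t pitch_in_bytes) {
    assert(base != nullptr);
    const unsigned char* bytes = static_cast<const unsigned char*>(base);
    float value;
    std::memcpy(&value, bytes + index * pitch_in_bytes, sizeof(float));
    return value;
}
```

With a dense float array the pitch is sizeof(float); to view only the x field of an array of particle records, the base would be &p[0].x and the pitch sizeof(particle).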

Wow, it seems I was not as close as I thought. Thank you!
- paul

The first public draft of the Intel ArBB VM API specification is now posted to our documentation page at http://software.intel.com/en-us/articles/intel-array-building-blocks-documentation/.
FYI,
--Amanda
