# Basic Components of the MI Method

- Expectation Maximization (EM) algorithm that computes the start point for the Data Augmentation (DA) algorithm
- Simulation-based DA function that uses Intel® MKL random number generators

The start estimates are packed as a one-dimensional array `init_estimates` of size `p + p(p+1)/2`. The package format is as follows:

- For `i = 0, ..., p-1`, `init_estimates[i]` contains the start estimate of the `i`-th mean.
- The remaining positions of the array store the upper-triangular part of the variance-covariance matrix.
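As an illustration of this layout, the following sketch packs a vector of means and a full covariance matrix into one array. The helper names `init_estimates_size` and `pack_init_estimates` are hypothetical and not part of the library, and the row-by-row order of the upper triangle is an assumption that the text above does not confirm.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in the packed array: p means plus p(p+1)/2 covariances. */
size_t init_estimates_size(size_t p)
{
    return p + p * (p + 1) / 2;
}

/*
 * Hypothetical helper: pack the means and the upper triangle of a full
 * p x p covariance matrix (row-major) into init_estimates.
 * ASSUMPTION: the upper triangle is written row by row:
 * cov[0][0], cov[0][1], ..., cov[0][p-1], cov[1][1], ...
 */
void pack_init_estimates(size_t p, const double *mean, const double *cov,
                         double *init_estimates)
{
    size_t i, j, k = 0;

    assert(p > 0 && mean && cov && init_estimates);

    /* Positions 0..p-1: start estimates of the means. */
    for (i = 0; i < p; i++)
        init_estimates[k++] = mean[i];

    /* Remaining positions: upper-triangular part of the covariance matrix. */
    for (i = 0; i < p; i++)
        for (j = i; j < p; j++)
            init_estimates[k++] = cov[i * p + j];

    assert(k == init_estimates_size(p));
}
```

For `p = 3`, the packed array holds 9 elements: 3 means followed by 6 covariance entries.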

The prior parameters for `μ` and `Σ` are `μ0`, `τ`, `m`, and `Λ⁻¹`. As the DA function uses the inverted matrix, the MI algorithm expects the inverse of `Λ`. These parameters are packed as a one-dimensional array `prior` of size `(p² + 3p + 4)/2`, which is enough to hold all the parameters. The storage format is as follows:

- `prior[0], ..., prior[p-1]` contain the elements of the vector `μ0`.
- `prior[p]` contains the parameter `τ`.
- `prior[p+1]` contains the parameter `m`.
- The remaining positions contain the upper-triangular part of the inverted matrix `Λ⁻¹`.
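The storage format above can be sketched as a small packing routine. The names `prior_size` and `pack_prior` are hypothetical helpers, not library functions, and the row-by-row order of the upper triangle of the inverted matrix is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Size of the prior array: p (mu0) + 1 (tau) + 1 (m) + p(p+1)/2 (inverse of Lambda). */
size_t prior_size(size_t p)
{
    return (p * p + 3 * p + 4) / 2;
}

/*
 * Hypothetical helper: pack the prior parameters into one array.
 * lambda_inv is the full p x p inverted matrix in row-major order.
 * ASSUMPTION: its upper triangle is stored row by row.
 */
void pack_prior(size_t p, const double *mu0, double tau, double m,
                const double *lambda_inv, double *prior)
{
    size_t i, j, k = 0;

    assert(p > 0 && mu0 && lambda_inv && prior);

    for (i = 0; i < p; i++)      /* prior[0..p-1]: vector mu0 */
        prior[k++] = mu0[i];

    prior[k++] = tau;            /* prior[p]: parameter tau */
    prior[k++] = m;              /* prior[p+1]: parameter m */

    for (i = 0; i < p; i++)      /* remaining positions: upper triangle */
        for (j = i; j < p; j++)
            prior[k++] = lambda_inv[i * p + j];

    assert(k == prior_size(p));
}
```

The size check at the end confirms the arithmetic: `p + 2 + p(p+1)/2` equals `(p² + 3p + 4)/2`, for example 11 elements when `p = 3`.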

Default values of the prior parameters:

- `μ0` is set to an array of `p` zeros.
- `τ` is set to 0.
- `m` is set to `p`.
- For the initial approximation of `Λ⁻¹`, the zero matrix is used.

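The MI method produces `m` sets of imputed values and/or a sequence of parameter estimates drawn during the DA procedure. The imputed values are returned as a single array that holds `m` sets, each of size `missing_value_num`, that is, `m*missing_value_num` values in total.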

**Example:**

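Consider a dataset with `n`=10 observations, where the second vector misses values for the first and the second variables, and the seventh observation misses the first point. The number of sets to impute is `m`=2. Thus, three values are missing in total, and the array of imputed values holds `2*3 = 6` elements.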

The sequence of parameter estimates is returned as a single array of size `m*da_iter_num*(p + (p² + p)/2)`, where:

- `m` is the number of sets of values to impute
- `da_iter_num` is the number of DA iterations
- `p + (p² + p)/2` is the size of the memory to hold one set of the parameter estimates.

Within each set of the parameter estimates:

- The vector of means occupies the first `p` positions.
- The upper-triangular part of the variance-covariance matrix occupies the remaining `(p² + p)/2` positions.
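The size formula and the layout of one set can be sketched with a few index helpers. All function names here are hypothetical. The sketch assumes that the sets are stored contiguously one after another. The text does not say whether they are grouped by imputed copy or by DA iteration, so `k` below is only a flat set index.

```c
#include <assert.h>
#include <stddef.h>

/* Memory needed for one set of estimates: p means + (p^2 + p)/2 covariances. */
size_t estimates_set_size(size_t p)
{
    return p + (p * p + p) / 2;
}

/* Total size of the array that holds all parameter estimates. */
size_t estimates_size(size_t m, size_t da_iter_num, size_t p)
{
    return m * da_iter_num * estimates_set_size(p);
}

/* ASSUMPTION: set k starts at offset k * estimates_set_size(p). */
const double *estimate_means(const double *estimates, size_t p, size_t k)
{
    assert(estimates && p > 0);
    return estimates + k * estimates_set_size(p);
}

/* The upper-triangular covariance part follows the p means of the same set. */
const double *estimate_cov_upper(const double *estimates, size_t p, size_t k)
{
    return estimate_means(estimates, p, k) + p;
}
```

For `p = 3`, `m = 5`, and `da_iter_num = 10`, one set takes 9 elements and the whole array takes 450.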

Missing values in the dataset are marked with one of the following values:

- `VSL_SS_DNAN`, if the dataset is stored in double precision floating-point arithmetic
- `VSL_SS_SNAN`, if the dataset is stored in single precision floating-point arithmetic

After you obtain `m` sets of imputed values, you can place them in the cells of the data matrix with missing values and use other Summary Statistics algorithms to analyze and compute estimates for each of the `m` complete datasets.
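A minimal sketch of the placement step follows. It is an illustration under stated assumptions, not the library's procedure. The helpers `count_missing` and `impute_set` are hypothetical. A standard NaN stands in for the `VSL_SS_DNAN` marker. The values inside one set are assumed to follow the order in which missing cells occur when the matrix is scanned in memory order, and the sets are assumed to be stored one after another.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Count cells marked as missing (NaN stands in for VSL_SS_DNAN here). */
size_t count_missing(const double *x, size_t len)
{
    size_t i, cnt = 0;
    for (i = 0; i < len; i++)
        if (isnan(x[i]))
            cnt++;
    return cnt;
}

/*
 * Hypothetical helper: build one complete dataset from the dataset with
 * missing values and the set of imputed values with index set_index.
 * ASSUMPTION: set s occupies simul_missing_vals[s * missing_value_num ...],
 * and its values match the missing cells in memory-scan order.
 * Returns the number of cells that were filled.
 */
size_t impute_set(const double *x_missing, double *x_complete, size_t len,
                  const double *simul_missing_vals, size_t missing_value_num,
                  size_t set_index)
{
    const double *set = simul_missing_vals + set_index * missing_value_num;
    size_t i, used = 0;

    assert(x_missing && x_complete && simul_missing_vals);

    for (i = 0; i < len; i++) {
        if (isnan(x_missing[i])) {
            assert(used < missing_value_num);
            x_complete[i] = set[used++];   /* fill the missing cell */
        } else {
            x_complete[i] = x_missing[i];  /* keep the observed value */
        }
    }
    return used;
}
```

Keeping the original matrix with its markers intact and writing each complete dataset to a separate buffer lets the same missing cells be located again for every set.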

```c
#include "mkl_vsl.h"

#define DIM      3                      /* dimension of the task */
#define N        10000                  /* number of observations */
#define M        VSL_SS_MI_PARAMS_SIZE  /* number of MI parameters */
#define M_VALUE  9                      /* total number of missing values */
#define M_COPIES 5                      /* number of sets of imputed values */

int main()
{
    int status;
    VSLSSTaskPtr task;
    MKL_INT i;
    double x[DIM * N];                  /* matrix of observations */
    double W[2];
    double mean[DIM], r2m[DIM], c2m[DIM];
    MKL_INT p, n, xstorage;
    double em_iter_num, da_iter_num, em_accuracy, copy_num, missing_value_num;
    double params[M], simul_missing_vals[M_VALUE * M_COPIES];
    MKL_INT nparams = M, simul_missing_val_n = M_VALUE * M_COPIES;

    /* Pre-process the dataset and mark entries of missing values with VSL_SS_DNAN */
    ...

    /* Parameters of the task and initialization */
    p = DIM;
    n = N;
    xstorage = VSL_SS_MATRIX_STORAGE_ROWS;

    /* Parameters of the MI algorithm */
    em_iter_num       = 100;
    da_iter_num       = 10;
    em_accuracy       = 0.001;
    copy_num          = M_COPIES;
    missing_value_num = M_VALUE;

    params[0] = em_iter_num;
    params[1] = da_iter_num;
    params[2] = em_accuracy;
    params[3] = copy_num;
    params[4] = missing_value_num;

    /* Create a task */
    status = vsldSSNewTask( &task, &p, &n, &xstorage, x, 0, 0 );

    /* Initialize the task parameters */
    status = vsldSSEditMissingValues( task, &nparams, params, 0, 0, 0, 0,
                                      &simul_missing_val_n, simul_missing_vals, 0, 0 );

    /* Generate M_COPIES sets of imputed values */
    status = vsldSSCompute( task, VSL_SS_MISSING_VALS, VSL_SS_METHOD_MI );

    /* Use the task to analyze the complete datasets */
    W[0] = 0.0;
    W[1] = 0.0;
    for ( i = 0; i < p; i++ )
    {
        mean[i] = 0.0;
        r2m[i]  = 0.0;
        c2m[i]  = 0.0;
    }

    status = vsldSSEditTask( task, VSL_SS_ED_ACCUM_WEIGHT, W );
    status = vsldSSEditMoments( task, mean, r2m, 0, 0, c2m, 0, 0 );

    for ( i = 0; i < M_COPIES; i++ )
    {
        /* Perform imputation of the next set of simulated values into x */
        ...

        /* Compute the mean and the variance using the fast method */
        status = vsldSSCompute( task, VSL_SS_MEAN | VSL_SS_2C_MOM, VSL_SS_METHOD_FAST );

        /* Analyze the computed estimates */
        ...
    }

    /* Deallocate the task resources */
    status = vslSSDeleteTask( &task );

    return 0;
}
```