Life Sciences

Intel® Summary Statistics Library: how to deal with missing observations?

Real-life datasets can have missing values. Sociological surveys and measurements of complex biological systems are two examples where the researcher eventually has to decide what to do about missing observations. Outliers in a dataset can also be treated as lost samples. Intel® Summary Statistics Library already contains functionality to detect outliers and to compute robust estimates in the presence of “suspicious” observations.

Intel® Summary Statistics Library: what is new in the Update?

Intel® Summary Statistics Library 1.0 Update is available for download. It includes several features and benefits:

Algorithm for parameterization of a correlation matrix. The algorithm transforms an input matrix that lacks positive semidefiniteness into an output that meets the properties of a correlation matrix. It is based on the spectral decomposition method and can be used in financial computations.
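As a rough illustration, one common textbook form of the spectral decomposition approach is sketched below. The symbols $C$, $Q$, $\Lambda$, $B$ and $\hat{C}$ are introduced here for illustration only, and the exact steps used inside the library are not stated in this post, so treat this as an assumption rather than a description of the library's implementation.

```latex
% Input: symmetric matrix C with unit diagonal that is not positive semidefinite.
\begin{align*}
  C &= Q \Lambda Q^{\mathsf{T}},
      & \Lambda &= \operatorname{diag}(\lambda_1, \dots, \lambda_p)
      && \text{(spectral decomposition)} \\
  \Lambda^{+} &= \operatorname{diag}\bigl(\max(\lambda_1, 0), \dots, \max(\lambda_p, 0)\bigr)
      &&&& \text{(drop negative eigenvalues)} \\
  B &= Q \, (\Lambda^{+})^{1/2}
      &&&& \text{(factor with rows } b_1, \dots, b_p\text{)} \\
  \hat{b}_i &= \frac{b_i}{\lVert b_i \rVert_2}, \quad i = 1, \dots, p
      &&&& \text{(rescale rows, assuming } b_i \neq 0\text{)} \\
  \hat{C} &= \hat{B} \hat{B}^{\mathsf{T}}
      &&&& \text{(positive semidefinite, unit diagonal)}
\end{align*}
```

Because $\hat{C}$ is a product of a matrix with its own transpose, it is positive semidefinite by construction, and the row rescaling restores ones on the diagonal.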

Intel® Summary Statistics Library: why not to use multi-core advantages?

In my previous posts I described some features and the usage model of Intel® Summary Statistics Library. However, many statistical packages already provide similar functionality. Does Intel® Summary Statistics Library make a difference and bring something new and specific? The answer is yes.

Intel® Summary Statistics Library: how to detect outliers in datasets?

Earlier I computed various statistical estimates, such as the mean or the variance-covariance matrix, using Intel® Summary Statistics Library. In those cases I knew for sure that my datasets did not contain outliers, that is, “bad” observations that do not belong to the distribution I was observing. However, in some cases we need to deal with datasets that are contaminated with outliers.

Intel® Summary Statistics Library: how to process data in chunks?

In my previous post I considered the computation of statistical estimates for in-memory datasets using the tools available in Intel® Summary Statistics Library. Each new day brings new problems, and today I need to compute the same estimates for data that cannot fit into a computer's memory.
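To make the idea of chunked processing concrete, here is a minimal Python sketch of how a mean and a sample variance can be accumulated one block at a time without holding the full dataset in memory. It does not use the library's API; the `RunningStats` class and its fields are hypothetical names chosen for this illustration, and the merge step is the standard pairwise update for combining partial means and sums of squared deviations.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Hypothetical accumulator for mean and variance over data chunks."""
    n: int = 0          # number of observations seen so far
    mean: float = 0.0   # running mean
    m2: float = 0.0     # running sum of squared deviations from the mean

    def update(self, chunk):
        """Merge one chunk (a sequence of numbers) into the running totals."""
        k = len(chunk)
        if k == 0:
            return
        chunk_mean = sum(chunk) / k
        chunk_m2 = sum((x - chunk_mean) ** 2 for x in chunk)
        delta = chunk_mean - self.mean
        total = self.n + k
        # Pairwise merge of two partial results (running totals and chunk).
        self.m2 += chunk_m2 + delta * delta * self.n * k / total
        self.mean += delta * k / total
        self.n = total

    @property
    def variance(self):
        """Sample variance (divides by n - 1); NaN with fewer than 2 points."""
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")


# Usage: feed the data block by block, as if each block were read from disk.
stats = RunningStats()
data = list(range(1, 11))
for start in range(0, len(data), 3):
    stats.update(data[start:start + 3])
```

Only the three running totals are kept between chunks, so memory use does not grow with the size of the dataset.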

Intel® Summary Statistics Library: Several Estimates at One Stroke

Today I needed to compute statistical estimates for a dataset. The observations are weighted, and only some components of the random vector have to be analyzed. How often do we face such tasks, and how do we solve them in everyday life? If such problems come up rarely or are small, then using a popular statistical package or writing a data processing program is a reasonable way to address them. But what if I need to process huge data arrays regularly, for example when analyzing gene expression levels?
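For a small dataset, the task described above can be sketched in a few lines of plain Python. This is only an illustration of the computation, not the library's interface: the function name `weighted_mean_variance` and its parameters are hypothetical, and the variance here is normalized by the sum of the weights, which is one of several possible conventions.

```python
def weighted_mean_variance(observations, weights, components):
    """Weighted mean and variance for selected vector components.

    observations: list of rows, one row per observation
    weights:      one non-negative weight per observation
    components:   indices of the components (columns) to analyze
    Assumption: variance is normalized by the sum of weights.
    """
    w_sum = sum(weights)
    means, variances = [], []
    for j in components:
        # Both estimates for component j are computed from the same column.
        mean = sum(w * row[j] for row, w in zip(observations, weights)) / w_sum
        var = sum(w * (row[j] - mean) ** 2
                  for row, w in zip(observations, weights)) / w_sum
        means.append(mean)
        variances.append(var)
    return means, variances


# Usage: three observations of a 3-component vector; analyze components 0 and 2.
obs = [[1.0, 10.0, 100.0],
       [2.0, 20.0, 200.0],
       [3.0, 30.0, 300.0]]
w = [1.0, 1.0, 2.0]
means, variances = weighted_mean_variance(obs, w, [0, 2])
```

A loop like this is fine for occasional small jobs; the point of the post is that regular processing of huge arrays calls for something faster.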
