# Life Sciences

# The switch() statement isn't really evil, right?

In my current position, I optimize and parallelize codes that deal with genomic data such as DNA, RNA, and proteins. To be universally readable, many of the input files holding DNA samples (called *reads*) are text files full of the characters 'A', 'C', 'G', and 'T'.
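As a rough illustration of the per-character branching that such read files call for, here is a minimal Python sketch that tallies the four bases in a read. It is not code from the post: the name `count_bases` and the decision to lump every other character under `'other'` are assumptions, and since Python has no `switch()`, a membership test plus a dictionary update stands in for it.

```python
BASES = "ACGT"


def count_bases(read):
    """Tally 'A', 'C', 'G', 'T' in a read; anything else counts as 'other'."""
    counts = {"A": 0, "C": 0, "G": 0, "T": 0, "other": 0}
    for ch in read:
        # Stand-in for a switch() over the character: pick the counter
        # for a known base, or fall through to the 'other' bucket.
        key = ch if ch in BASES else "other"
        counts[key] += 1
    return counts


if __name__ == "__main__":
    print(count_bases("ACGTTGCAN"))
```

The `'other'` bucket plays the role of a `default:` branch, which matters in practice because read files can contain characters beyond the four bases.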

# Intel® Summary Statistics Library: how to deal with missing observations?

Real-life datasets can have missing values. Sociological surveys and measurements of complex biological systems are two examples where the researcher reaches a point at which they must decide how to handle missing observations. One can also treat outliers in a dataset as lost samples. Intel® Summary Statistics Library already contains functionality to detect outliers or get robust estimates in the presence of “suspicious” observations.

# Intel® Summary Statistics Library: how to use the robust methods?

Intel® Summary Statistics Library provides several options for processing datasets “contaminated” with outliers. Earlier I demonstrated how to detect “suspicious” observations in a dataset.

# Intel® Summary Statistics Library: how fast is the algorithm for detection of outliers?

In one of my previous posts I described the scheme for detecting outliers in datasets, which is an important component of the Intel® Summary Statistics Library.

# Intel® Summary Statistics Library: what is new in the Update?

**Algorithm for parameterization of correlation matrix**. The algorithm transforms an input matrix that lacks the property of positive semidefiniteness into an output that meets the properties of a correlation matrix. The algorithm is based on the spectral decomposition method and can be used in financial computations.

# Intel® Summary Statistics Library: why not to use multi-core advantages?

In my previous posts I described some features and the usage model of Intel® Summary Statistics Library. However, there are many statistical packages available that provide similar functionality. Does Intel® Summary Statistics Library make a difference and bring something new and specific? The answer is yes.

# Intel® Summary Statistics Library: how to detect outliers in datasets?

Earlier I computed various statistical estimates, such as the mean or the variance-covariance matrix, using Intel® Summary Statistics Library. In those cases I knew for sure that my datasets did not contain outliers, that is, “bad” observations (points that do not belong to the distribution I observed). However, in some cases we need to deal with datasets that are contaminated with outliers.
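To make the idea of flagging “bad” observations concrete, here is a minimal one-dimensional Python sketch. It uses the modified z-score built from the median and the median absolute deviation (MAD), a common textbook rule swapped in purely for illustration; it is not the library's detection algorithm or its API. The name `flag_outliers` and the threshold of 3.5 are assumptions.

```python
import statistics


def flag_outliers(values, threshold=3.5):
    """Return a list of booleans, True where a value looks like an outlier.

    Uses the modified z-score 0.6745 * |x - median| / MAD.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(x - med) for x in values])
    if mad == 0:
        # At least half the values equal the median, so the score is
        # undefined; this sketch flags nothing rather than guess.
        return [False] * len(values)
    return [0.6745 * abs(x - med) / mad > threshold for x in values]


if __name__ == "__main__":
    sample = [9.8, 10.1, 10.0, 9.9, 10.2, 25.0]
    print(flag_outliers(sample))
```

The median and MAD are used instead of the mean and standard deviation because a single extreme point can distort the latter enough to hide itself.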

# Intel® Summary Statistics Library: how to process data in chunks?

# Intel® Summary Statistics Library: Several Estimates at One Stroke

Today I needed to compute statistical estimates for a dataset. The observations were weighted, and only several components of the random vector had to be analyzed. How often do we solve such tasks, and how do we solve them in our everyday life? If we meet such problems rarely, or their size is small, then using a popular statistical package or developing a data processing program is a proper way to address the problem. But what if I need to process huge data arrays regularly, for example when analyzing gene expression levels?
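For a small dataset, the task described above can be sketched in plain Python: weighted means and variances computed only for chosen components of each observation. This is an illustrative sketch, not the library's interface. The name `weighted_stats`, the `columns` parameter, and the choice of the weighted population variance (normalized by the sum of the weights) are all assumptions.

```python
def weighted_stats(observations, weights, columns):
    """Weighted mean and variance for the selected columns.

    observations: list of equal-length rows (one row per observation)
    weights:      one non-negative weight per row, not all zero
    columns:      indices of the vector components to analyze
    Returns {column: (mean, variance)}, with the variance normalized
    by the sum of the weights.
    """
    total_w = sum(weights)
    if total_w <= 0:
        raise ValueError("weights must sum to a positive value")
    result = {}
    for j in columns:
        mean = sum(w * row[j] for row, w in zip(observations, weights)) / total_w
        var = sum(w * (row[j] - mean) ** 2
                  for row, w in zip(observations, weights)) / total_w
        result[j] = (mean, var)
    return result


if __name__ == "__main__":
    data = [[1.0, 10.0, 100.0], [2.0, 20.0, 200.0], [3.0, 30.0, 300.0]]
    print(weighted_stats(data, weights=[1, 2, 1], columns=[0, 2]))
```

This version makes two passes over the data per column, which is fine for small inputs but is exactly the kind of cost that becomes noticeable on huge arrays.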
