Article

Preparing for Parallel Optimization

Optimizing your application for multi-core technology can result in big performance improvements, but it requires a plan of action that is well suited to your application. This article gives an overview of key steps to follow as you optimize your code.
Created by Diana B. (Intel) Last updated on 04/07/2019 - 22:00
Article

A Brief Survey of NUMA (Non-Uniform Memory Architecture) Literature

This document presents a list of articles on NUMA (Non-uniform Memory Architecture) that the author considers particularly useful. The document is divided into categories corresponding to the type of article being referenced. Often the referenced article could have been placed in more than one category. In this situation, the reference to the article is placed in what the author thinks is the...
Last updated on 15/10/2019 - 15:22
Article

Modern Memory Subsystems Benefits for Data Base Codes, Linear Algebra Codes, Big Data, and Enterprise Storage

This article describes and contrasts the advantages of different types of memory, including Multi-Channel DRAM (MCDRAM) and High-Bandwidth Memory (HBM), the future 3D XPoint™ memory devices, and Intel® Omni-Path Fabric (Intel® OP Fabric).
Last updated on 30/09/2019 - 17:28
Blog post

Debug Intel® Transactional Synchronization Extensions

If printf or fprintf functions cause transaction aborts, use Intel® Processor Trace as a work-around.
Created by Roman Dementiev (Intel) Last updated on 04/07/2019 - 17:00
Article

Putting Your Data and Code in Order: Data and layout - Part 2

Apply the concepts of parallelism and distributed memory computing to your code to improve software performance. This paper expands on the concepts discussed in Part 1 to consider parallelism, covering vectorization (single instruction, multiple data, or SIMD), shared-memory parallelism (threading), and distributed-memory computing.
Created by David M. Last updated on 12/03/2020 - 23:40
Article

Putting Your Data and Code in Order: Data and Layout - Part 2 (Chinese edition: 整理您的数据和代码: 数据和布局 - 第 2 部分)

Apply the concepts of parallelism and distributed memory computing to your code to improve software performance. This paper expands on the concepts discussed in Part 1 to consider parallelism, covering vectorization (single instruction, multiple data, or SIMD), shared-memory parallelism (threading), and distributed-memory computing.
Created by David M. Last updated on 12/03/2020 - 23:40
Article

Choosing the right threading framework

This is the second article in a series about high-performance computing with the Intel Xeon Phi.

Last updated on 12/03/2020 - 23:40
Article

Putting Your Data and Code in Order: Data and Layout, Part 2 (Russian edition: Приводим данные и код в порядок: данные и разметка, часть 2)

This pair of articles on performance and memory covers basic concepts to provide guidance to developers seeking to improve software performance. This paper expands on the concepts discussed in Part 1 to consider parallelism, covering vectorization (single instruction, multiple data, or SIMD), shared-memory parallelism (threading), and distributed-memory computing.
Created by David M. Last updated on 12/03/2020 - 23:40
Article

Improve Performance with Vectorization

This article focuses on the steps to improve software performance with vectorization. Included are examples of full applications along with some simpler cases to illustrate the steps to vectorization.
Created by David M. Last updated on 12/03/2020 - 23:40
Article

Specifics of Optimizing Computations in Application Programs Written in C, Using European Option Pricing as an Example (Russian: Особенности оптимизации вычислений в прикладных программах на языке С на примере оценивания опционов европейского типа)

S.I. Bastrakov, R.V. Donchenko, I.B. Meyerov, A.N. Polovinkin, Lobachevsky State University of Nizhny Novgorod
Last updated on 19/03/2020 - 23:30