Blog post

Analysis of the Hadoop-Based CF Code in Mahout 0.5

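Mahout's Taste framework is an implementation of collaborative filtering (CF) algorithms. It supports DataModel backends such as files, databases, and NoSQL stores, and it also supports Hadoop MapReduce. This post mainly analyzes the MapReduce-based implementation in Mahout 0.5.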

Authored by Last updated on 01/24/2019 - 16:00
Article

Hadoop 0.22.0 and Its RAID Deployment

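I have been using the 0.20.X series of Hadoop for almost a year, focusing mainly on HDFS. During that time I took part in deploying a Hadoop cluster (1 server + 20 PCs) and in analyzing the HDFS source code.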

Authored by Last updated on 01/24/2019 - 16:00
Blog post

Notes on Installing Hadoop on Ubuntu

Hadoop version: hadoop-1.2.1-bin.tar

JDK version: jdk-6u30-linux-i586

Authored by Last updated on 01/24/2019 - 16:00
Blog post

Restudy SchemaRDD in SparkSQL

At the very beginning, SchemaRDD was designed simply as an attempt to make life easier for developers in their daily routine of debugging and unit testing the SparkSQL core module. The idea boils down to describing the data structures inside an RDD using a formal description similar to a relational database schema. On top of all the basic functions provided by the common RDD APIs, SchemaRDD also...
Authored by Last updated on 06/14/2017 - 16:50
Article

What is Intel® DAAL?

The Intel® Data Analytics Acceleration Library (Intel® DAAL) is a library of building blocks optimized for Intel® Architecture that covers all data analytics stages: data acquisition from a data source, preprocessing, transformation, data mining, modeling, validation, and decision making. To achieve best performance on a range of Intel® processors, Intel DAAL uses optimized algorithms from the Intel®...
Authored by Vipin Kumar E K (Intel) Last updated on 10/08/2018 - 06:30
Article

How to Use Intel® DAAL in Java Applications

Intel® Data Analytics Acceleration Library (Intel® DAAL) provides an easy-to-use Java API for Java programmers. This article discusses how to build and run applications with the Eclipse IDE (one of the most popular Java IDEs). The procedures outlined in this article should also apply to other Java IDEs. If you want to build and run Java applications from the command line, see...
Authored by Zhang, Zhang (Intel) Last updated on 10/03/2018 - 07:24
Article

A Walk-Through of Distributed Processing Using Intel® DAAL

Intel® Data Analytics Acceleration Library (Intel® DAAL) is a new highly optimized library targeting data mining, statistical analysis, and machine learning applications. It provides advanced building blocks supporting all data analysis stages (preprocessing, transformation, analysis, modeling, decision making) for offline, streaming and distributed analytics usages. Intel DAAL support...
Authored by Ying H. (Intel) Last updated on 10/04/2018 - 04:16
Article

Intel® Parallel Computing Center at Georgia Institute of Technology

The Intel® Parallel Computing Center (Intel® PCC) on Big Data in Biosciences and Public Health is focused on developing and optimizing parallel algorithms and software on Intel® Xeon® Processor and Intel® Xeon Phi™ Coprocessor systems for handling high-throughput DNA sequencing data and gene expression data.
Authored by admin Last updated on 11/14/2017 - 08:27
Blog post

How Moscow Institute of Physics and Technology Rocketed the Development of Hypersonic Vehicles

The Moscow Institute of Physics and Technology (MIPT) Laboratory is focused on futuristic vehicles such as airplanes and spacecraft that travel at high speeds.

Authored by Sally Sams (Intel) Last updated on 03/21/2019 - 12:00
Article

Live Webinar: Boost Python* Performance with Intel® Math Kernel Library

Python* is a popular open-source scripting language known for its easy-to-learn syntax and active developer community.
Authored by Mike P. (Intel) Last updated on 06/07/2017 - 10:28