Blog posts

Code Analysis of Hadoop-Based CF in Mahout 0.5

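Mahout's Taste framework is an implementation of collaborative filtering algorithms. It supports DataModels backed by files, databases, NoSQL stores, and so on, and it also supports Hadoop MapReduce. This post mainly analyzes the MapReduce-based implementation in Mahout 0.5.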

Author Last updated 24/01/2019 - 16:00
Article

Hadoop 0.22.0 and Its RAID Deployment

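I have been using the Hadoop 0.20.X series for almost a year, focusing mainly on HDFS. During that time I took part in deploying a Hadoop cluster (1 server + 20 PCs) and in analyzing the HDFS source code.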

Author Last updated 24/01/2019 - 16:00
Blog posts

Part #1 - Tuning Java Garbage Collection for HBase

In part #1 of this multi-part post, we look at how to tune Java garbage collection (GC) for HBase, focusing on 100% YCSB reads. In part #2, we will look at 100% writes, and finally, in part #3, we will tune Java GC for a 50/50 mix of reads and writes. As already mentioned, we are using YCSB, which seems to be the de facto NoSQL workload. We won't go into much detail on how to install, configure...
Author Eric Kaczmarek (Intel) Last updated 14/06/2017 - 16:10
Video

Intel Software Optimization of Java* Virtual Machine and OpenJDK Community Announcement (OOW '14)

At Oracle OpenWorld 2014, Michael Greene talks about the role of his organization in helping optimize the Java* Virtual Machine and Intel's announcement that it is joining the Java OpenJDK communit

Author Last updated 27/03/2019 - 14:04
Video

Big Data Java Optimization

This video provides an overview of the Java programming language and its benefits for enterprise applications.

Author admin Last updated 14/06/2017 - 08:55
Blog posts

Experience and Lessons Learned for Large-Scale Graph Analysis using GraphX

While GraphX provides nice abstractions and dataflow optimizations for parallel graph processing on top of Apache Spark*, there are still many challenges in app

Author Mike P. (Intel) Last updated 14/06/2017 - 15:44
Blog posts

Hadoop RPC Mechanism and Source Code Analysis

1. Basic Principles of RPC

Author Last updated 03/07/2019 - 20:08
Article

Intel Keynote and Intel technical presentations at Spark Summit West 2015

To find new trends and strong patterns in large, complex data sets, a strong analytics foundation is needed. Intel is working closely with Databricks, AMPLab, the Spark community, and its ecosystem to advance these analytics capabilities…
Author Mike P. (Intel) Last updated 07/06/2017 - 09:33
Article

Tuning Java* Garbage Collection for Spark* Applications

Spark is gaining wide industry adoption due to its superior performance, simple interfaces, and a rich library for analysis and calculation.

Author Mike P. (Intel) Last updated 10/05/2019 - 08:30
Article

A Mission-Critical Big Data Platform for the Real-Time Enterprise

As the volume and velocity of enterprise data continue to grow, extracting high-value insight is becoming more challenging and more important. Businesses that can analyze fresh operational data instantly—without the delays of traditional data warehouses and data marts—can make the right decisions faster to deliver better outcomes.
Author Nguyen, Khang T (Intel) Last updated 06/07/2019 - 16:40