Blog post

Looking at big data performance

My very first Intel blog entry!! Exciting!!
Authored by Eric Kaczmarek (Intel) Last updated on 06/14/2017 - 15:43
Blog post

Part #1 - Tuning Java Garbage Collection for HBase

In part #1 of this multi-part post, we will take a look at how to tune Java garbage collection (GC) for HBase, focusing on 100% YCSB reads. In part #2, we will look at 100% writes, and finally in part #3, we will tune Java GC for a 50/50 mix of reads and writes. As already mentioned, we are using YCSB, which seems to be the de facto NoSQL workload. We won't go into much detail on how to install, configure...
Authored by Eric Kaczmarek (Intel) Last updated on 06/14/2017 - 16:10
Blog post

Hands-on Hive-on-Spark in the AWS Cloud

by Brock Noland (Cloudera), Na Yang (MapR), and Rui Li (Intel)

Authored by Last updated on 06/14/2017 - 15:43
Blog post

Apache Spark* Innovation: Driving a Stronger Community Standard

This blog post was jointly written by Jiangang Duan, Jie Huang and Weihua Jiang (Intel), Alex Gutow (Cloudera), and Dale Kim (MapR)

Authored by Last updated on 03/11/2019 - 13:17
Blog post

Unlocking Big Data with Open Source Solutions – Intel® Chip Chat episode 368

Ziya Ma, Director of Big Data Technologies at Intel, stops by to talk about how open source solutions are enabli

Authored by Mike P. (Intel) Last updated on 06/14/2017 - 15:44
Blog post

Experience and Lessons Learned for Large-Scale Graph Analysis using GraphX

While GraphX provides nice abstractions and dataflow optimizations for parallel graph processing on top of Apache Spark*, there are still many challenges in app

Authored by Mike P. (Intel) Last updated on 06/14/2017 - 15:44
Blog post

Ceph Erasure Coding Introduction

Ceph introduction
Authored by Yuan Zhou (Intel) Last updated on 06/14/2017 - 15:45
Blog post

Restudy SchemaRDD in SparkSQL

In the beginning, SchemaRDD was designed simply as an attempt to make life easier for developers in their daily routines of code debugging and unit testing on the SparkSQL core module. The idea boils down to describing the data structures inside an RDD using a formal description similar to a relational database schema. On top of all the basic functions provided by the common RDD APIs, SchemaRDD also...
Authored by Last updated on 06/14/2017 - 16:50
Blog post

Webinar: Apache Spark* and Big Data Analytics: Solving Real-World Problems

Big Data analysis is having an impact on every industry today. Industry leaders are capitalizing on these new business insights to drive competitive advantage.

Authored by Mike P. (Intel) Last updated on 05/08/2018 - 11:26