Open Source - OpenStack

Author Last updated 13/07/2018 - 14:32

Hadoop 0.22.0 and Its RAID Deployment

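I have been using the 0.20.X series of Hadoop for nearly a year, working mainly with HDFS. During that time I took part in deploying a Hadoop cluster (1 server + 20 PCs) and in analyzing the HDFS source code.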

Author Last updated 24/01/2019 - 16:00

Big Data: Take It Seriously


Author Shen Zhou (Intel) Last updated 05/07/2019 - 14:15

Intel Keynote and Intel technical presentations at Spark Summit West 2015

To find new trends and strong patterns in large, complex data sets, a strong analytics foundation is needed. Intel is working closely with Databricks, AMPLab, and the Spark community and its ecosystem to advance these analytics capabilities…
Author Mike P. (Intel) Last updated 07/06/2017 - 09:33

Intel® Parallel Computing Center at Georgia Institute of Technology

The Intel® Parallel Computing Center (Intel® PCC) on Big Data in Biosciences and Public Health is focused on developing and optimizing parallel algorithms and software on Intel® Xeon® Processor and Intel® Xeon Phi™ Coprocessor systems for handling high-throughput DNA sequencing data and gene expression data.
Author admin Last updated 14/11/2017 - 08:27

Installing Apache Zeppelin* on Cloudera Distribution of Hadoop*

Apache Zeppelin* is a new web-based notebook that enables data-driven, interactive data analytics, and visualization with the added bonus of supporting multiple languages, including Python*, Scala*, Spark SQL, Hive*, Shell, and Markdown. Zeppelin also provides Apache Spark* integration by default, making use of Spark’s fast in-memory, distributed, data processing engine to accomplish data science...
Author Last updated 07/06/2017 - 10:40

Tuning Java* Garbage Collection for Spark* Applications

Spark is gaining wide industry adoption due to its superior performance, simple interfaces, and a rich library for analysis and calculation.

Author Mike P. (Intel) Last updated 10/05/2019 - 08:30

Indexing DICOM* Images on Cloudera Hadoop* Distribution

This paper shows how to replicate the proof point of indexing DICOM images for storage, management, and retrieval on a Cloudera Hadoop* cluster using open source software components.
Author Last updated 22/02/2019 - 16:10

Major UK retailer gets customer visibility to enable growth

Marks & Spencer Develops Next Generation Analytics Capabilities with Cloudera. Leading UK Retailer Builds 360-Degree Customer View, Improves Attribution Modeling, and Brings Analytics Capability

Author Mike P. (Intel) Last updated 14/06/2017 - 13:15

Intel and Cloudera Reduce Insurance Fraud and Dramatically Improve Time to Access Claim Data

A major insurance company maintains over 60 years of adjuster claim notes in both structured and unstructured form.

Author Mike P. (Intel) Last updated 07/06/2017 - 10:27