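I have been using the 0.20.X series of Hadoop for almost a year, focusing mainly on HDFS. During that time I helped deploy a Hadoop cluster (1 server + 20 PCs) and also took part in analyzing the HDFS source code.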
Welcome to the RTFB (Reaching Technology From Blogs) Episode 8 Blog. Guests on RTFB are given an opportunity to talk about and promote their blogs. Thai Le joins us this time to talk about two of his blogs. For the first blog, Thai talks about how to speed up a cloud environment and workload performance on Intel® Architecture. In his second blog, Thai talks about considerations for tuning an Intel Xeon Linux/Apache server.
I was at the Hive User Group Meetup in NYC the night before Strata + Hadoop World 2012, where I talked about “SQL (92 and beyond) Support for Hive”, such as:
In the last several years, we have been working closely with our users and customers on their next-gen data analytics platforms using Hadoop and HBase. While the Hadoop stack has laid a solid foundation for these systems, building a flexible and efficient analytics platform still requires us to implement many new capabilities; “Project Panthera” is our open source effort to enable these new analytics capabilities on Hadoop/HBase.
As any good engineer knows, “if you cannot measure it, you cannot improve it.” A representative benchmark suite is the key to measuring any computer system. That’s exactly why we have constructed HiBench, a Hadoop benchmark suite consisting of both micro-benchmarks and real-world applications, including:
With the recent Hadoop World event hosted by Cloudera on October 2, 2009, Cloudera and Hadoop have been getting quite a bit of attention from the media, and the visibility for open source software in the cloud has increased along with them. I didn't attend the Hadoop World event, but I heard that it was well attended with solid content.