by Abinasha Karana, Director
Bizosys Technologies Pvt Ltd, Bangalore
Traditionally, web scale is achieved by master-slave replication and data sharing, which can be a huge challenge as data scales beyond 500Gb. Map-Reduce based technologies such as HBase* and Hadoop* achieve web scale and address this challenge by transparent partitioning, distribution and replication. Also, most of the currently available recommended HBase and Hadoop server configurations are targeted towards write intensive applications.
Bizosys Technologies* has built a sSearch engine whose index is on Hadoop and HBase to deploy in a cluster environment. Search applications by nature involve read intensive operations. Bizosys experimented with its search engine that involved use of latest hardware options, software configuration and cluster deployment provisioning. These tests were conducted on Intel hardware at the Intel® Innovation Labs in June 2010.
Bizosys search engine provides a single window to content and applications in an enterprise. Not only does it find relevant information, it presents in visually rich views with integration to relevant application as actions embedded inside the results. This dynamic portal is built on-the-fly from the user's search.
Bizosys search engine was tested in a cluster and standalone environments. The cluster consisted of homogenous machines. Refer to the deployment architecture and machine specifications below. The Intel® Xeon® processor 5600 series was used for deployment.
Three broad test scenarios were explored, viz.
TEST SCENARIO - I/O optimization using SSD, caching and compression
In a read intensive operation, CPU I/O WAIT is the biggest bottleneck. We resolved this by using:
SSD for a higher throughput (Read only)
To find out the impact of disks we experimented using:
In a test environment consisting of:
SSD performed better than SATA. The performance gain was significant with increased number of concurrent users reducing the CPU I/O WAIT timings. One disk vs. multi disks gave an overall better response time for the queries.
Currently Bizosys search engine keeps data in two buckets:
The Index section data access is much higher. A provision for SSD disk mounting for this section would be a better design consideration from I/O usage view point.
The current SSDs come with capacities of 60GB, 120GB and so on. The cost of SSDs is higher than the regular SATA disks for same storage capacity. Mixing SATA and SSD together will be a better value proposition from a cost perspective.
SSD for a higher throughput (Simultaneous Read Write)
Bizosys Search is designed to add, update and delete information in a real-time environment. This empowers Bizosys Search to serve queries while the index is undergoing changes (Additions and 6 Modifications). In the software stack Hadoop, HBase and Bizosys search engine are designed to handle simultaneous Read-Write.
This test benchmarks compares
In a test environment consisting of:
Both the read and Simultaneous read write response timings are almost same. Bizosys Search Index can undergo modifications while serving the queries with no impact to the query response time. However, we saw a slightly improved response for Simultaneous Read-Write case. This is possibly due to Write operation having loaded index to the memory. This could be critical in an enterprise context, where the permissions and content undergo changes very frequently.
Compression and In-memory Caching
Hadoop and HBase provides two ways to decongest I/O
Compressing the Index: Compression minimizes the data resident size. However each time loading data blocks requires decompressing which is a CPU intensive task. This can slowdown the response time in situations where content undergoes frequent changes. This is an additional overhead where the index size and concurrent clients are not sufficient to create a CPU I/O WAIT.
HBase and Hadoop provide zlib, lzo compression algorithms. In addition to these, we compiled Hadoop native code linking Intel® Integrated Performance Primitives (Intel® IPP) ZLib libraries to leverage multi-processors during decompression.
Bizosys Search "Preview data blocks" are compressed by default. These Preview data blocks take up considerable storage capacity, while very few of these are read per search.
Caching: Clears disk bottleneck by offloading I/O WAIT to RAM memory. The complexity arises due to limited amount of RAM relative to index size. HBase manages the memory using Eviction algorithm. However, a constant swapping to work on the constrained RAM will slow down response time. Sufficient RAM should be provided to avoid swapping.
We tested Bizosys search engine query response time on a compressed and cached index. The test environment consisted of:
Decompression increases the CPU usage. However, last test indicated a 98% idle CPU; which implies there was sequential processing. These sequential processing zones consist of HBase Region scanner data block parsing.
In all our testing we have employed the block size as 32MB (dfs.block.size property in hdfs-site.xml file). However, with compression, more records got packed on a data block. Parsing decompressed bigger data block files sequentially, resulted in a poorer query response time and a higher CPU idle time.
TEST SCENARIO - Read optimized server settings
Hadoop, HBase and Bizosys Search are highly configurable systems.
The primary areas of tuning are:
Hadoop, HBase, Zookeeper and Bizosys Search and Test client instances ran in a single machine and indexed 1 Million Wikipedia documents. Readings were taken on various server settings on this environment.
We focused tuning the following areas to remove software bottlenecks viz.:
Some of the configurations carried out are:
|hdfs-site.xml||dfs.block.size||Lower value offers parallelism.||33554432|
|dfs.datanode.handler.count||Number of handlers dedicated to serve data block requests in hadoop DataNodes.||100|
|core-site.xml||io.file.buffer.size||This is the read and write buffer size. By setting limit to 16KB it allows continuous streaming.||16384|
|hbase-site.xml||hbase.regionserver.handler.count||RPC Server instances spun up on HBase RegionServers||100|
|hfile.min.blocksize.size||Small size increases the index but reduces the lesser fetch on a random access.||65536|
|default.xml||SCAN_IPC_CACHE_LIMIT||Number of rows cached in Bizosys search engine for each scanner next call over the wire. It reduces the network round trip by 300 times caching 300 rows in each trip.||300|
|LOCAL_JOB_HANDLER_COUNT||Number of parallel queries executed at one go. Query requests above than this limit gets queued up.||100|
Hadoop JVM instance was set to run with "-server" option with a HEAP size of 4GB. HBase nodes ran with 4GB heap with JVM settings
"-server -XX:+UseParallelGC -XX:ParallelGCThreads=4 -XX:+AggressiveHeap -XX:+HeapDumpOnOutOfMemoryError"
The parallel GC leverages multiple CPUs.
Custom Indexing on HBase
Bizosys Search has an HBase extension indexing module. This Index module is the first step for all HBase requests. It is a binary gate which deciphers the availability of the term inside the region. If it's not available, whole processing at HBase Region server is avoided.
This index is designed in an optimal way to consume minimal memory foot print while deciphering query term availability in 1-2 milliseconds.
The block cache setting was only at 20% (around 800Mb) of allocated HEAP. For this specific setting lot of cache eviction was noticed. To avoid this, it is recommended that more memory for caching.
With these recommended configuration settings, 100 concurrent users experienced an average response time of 46 sec while in the tuned environment it came down to 5sec. It can be further improved by increasing In-Memory caching and distributed client provisioning.
TEST SCENARIO - Optimal Cluster Deployment Provisioning
Hadoop and HBase designed to scale in a cluster environment. HBase is currently in use at several places including Bing, Yahoo's Cluster with 10000 machines (See: http://www.slideshare.net/teste999/distributed-databases-overview-2722599)
Our cluster had 1 master machine and 3 slave machines. Master machine ran Hadoop name node, Zookeeper and the HBase master. Each slave machines ran Hadoop Data nodes and HBase Region Server.
We primarily tested three scenarios:
Bizosys Search result quality is configurable. It takes extra payload to calculate user specific relevance ranking (dynamic) besides the regular weight based static ranking (E.g.. documents from same role as the searcher, company departmental proximity, location proximity). This finer refinement depends on knowing the user and presenting relevant information. However, it also involves accessing more meta information.
We tested for both of these scenarios
In a test environment consisting of:
In cluster environment average I/O WAIT is lower, however I/O WAIT spikes were noticed.
Bizosys search engine deployment in a distributed environment demonstrated better performance than on a single machine.
Intel® Xeon® processor 5600 series automatically regulates power consumption and intelligently adjusts server performance according to your needs. Enabled by Intel® Intelligent Power Technology, these processors shift into the lowest available power state, while Intel® Turbo Boost Technology intelligently adjusts performance according to your needs.
Designed to enable increased secure server deployments, this series now includes Advanced Encryption Standard New Instructions (AES-NI), providing hardware-based acceleration for secure transaction servers and with hardware-based Intel® Trusted Execution Technology (Intel® TXT), Intel® Xeon® processor 5600 series helps guard against malicious software attacks.
Intel® Xeon® processor 5600 feature overview and benefits
Dynamic scalability, managed cores, threads, cache, interfaces, and power for energy-efficient performance on-demand.
For more information on the Intel® Xeon® processor 5600 series, please visit http://www.intel.com/Assets/en_US/PDF/prodbrief/323501.pdf
The Intel® Extreme SATA Solid-State Drive (SSD) offers outstanding performance and reliability, delivering the highest IOPS per watt for servers, storage and high-end workstations. Enterprise applications place a premium on performance, reliability, power consumption and space. Unlike traditional hard disk drives, Intel Solid-State Drives have no moving parts, resulting in a quiet, cool storage solution that also offers significantly higher performance than traditional server drives. Imagine replacing up to 50 high-RPM hard disk drives with one Intel® X25-E Extreme SATA Solid-State Drive in your servers - handling the same server workload in less space, with no cooling requirements and lower power consumption. That space and power savings, for the same server workload, will translate to a tangible reduction in your TCO.
For product brief and more information on the product, please visit: http://download.intel.com/design/flash/nand/extreme/extreme-sata-ssd-product-brief.pdf
Achieving optimal results from a Hadoop and HBase based Bizosys Search engine begins with choosing appropriate combination of disk throughput, in-memory caching, cluster deployment and multi CPU box.
We found SSDs to be very effective for both read and write operation. In-memory caching resulted in better response through setting right amount of "HEAP CACHE" to achieve higher cache hit percentage. Cluster environment served the requests faster where as "CPU I/O WAIT" spikes were noticed. Overall most of the CPUs remain idle during the test.
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804