Hadoop – DIY Big Data

Adding a New Node to the ODROID XU4 Cluster

August 20, 2016August 28, 2016 michael

I recently acquired another ODROID XU4 device (and a MicroSD card for bulk storage) to add it to my XU4 cluster. This new node brings my node count to five. Adding a new node to the cluster is relatively straight forward, but there are a lot of details. In the modern datacenter, this operation would be accomplished through a package manager, which would build the new node according to an image. However, I haven’t set up package management on my cluster so I will need to sit up the new node manually. In the original cluster configuration, I used a 5-port ethernet switch for the internal network. Given that there was an open port, no additional hardware beyond the new node is needed. However, if I were to add a sixth node (or beyond), I would need to update my ethernet switch to something such as an 8-port switch. I will also note that using the 40 mm PCB spacers I originally ordered makes Read More …

Running the Word Count Job with Hadoop

July 2, 2016July 16, 2016 michael

Now that we have Hadoop up and running on the ODROID XU4 cluster, let’s take it for a spin. Every technical platform has a “Hello World” type project, and for Hadoop and other map-reduce platforms, it is Word Count. In fact, the Apache Hadoop MapReduce Tutorial uses Word Count to demonstrate how to write a Hadoop job. We are going to use that tutorial’s code. However, at the time of this writing the WordCount example on the Apache Hadoop site has a bug in it. I’ve corrected that bug and posted the updated code to my Github repository for the ODROID XU4 cluster project. Log into the master node as hduser. We need to update the environment variables so that you can easily build Hadoop jobs. Do this by editing the .bashrc file and add the following at the end: export JAVA_HOME=$(readlink -f /usr/bin/java | sed “s:bin/java::”).. export PATH=$PATH:/usr/local/hadoop/sbin:/usr/local/hadoop/bin:${JAVA_HOME}/bin export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar Then grab the corrected Word Count job code from the Read More …

Mounting HDFS with NFS

June 26, 2016June 26, 2016 michael

After setting up the Hadoop installation on the ODROID XU4 cluster, we need to find a way to get data in and out of it. The traditional pattern used when a cluster is on it’s own network such as ours is is to have an edge node where the user logs into, transfers the data to that edge node, then put that data in HDFS from the edge node. Speaking from experience, this is annoyingly to much work. For my personal cluster, I want the HDFS file system to integrate with my Mac laptop. The most robust way to accomplish my goal with HDFS is to have it mounted as a NFS drive. The Hadoop distribution we are using has a NFS server built in. This server is run on the master node, effectively acting as a proxy between the HDFS cluster and the external network. The pros to this approach is that I get the usage paradigm that I want. Read More …

Installing Hadoop onto an ODROID XU4 Cluster

June 25, 2016August 19, 2016 michael

Now is the time when we start to see the fruits of our labor in getting the ODROID XU4 low cost cluster built. We will be installing Hadoop and configuring it to serve an NFS mount that can be mounted on your client computer (e.g., your laptop) to be able to interact with the HDFS file system as if it were another hard drive on your computer. This feature will greatly ease the use of our cluster, as it will minimize the need for a user to log into the cluster to use it. An NFS mount is not the only necessary facet of the cluster to enable the client usage vision, but it is an important one. Before we install Hadoop, let’s discuss what we are trying to accomplish by installing it. Hadoop has three components: the Hadoop File System (HDFS), Yarn, and Map-Reduce. For our purposes, we are most interested in HDFS, but we will play around with the Read More …

One byte at a time …

Tag: Hadoop

Adding a New Node to the ODROID XU4 Cluster

Running the Word Count Job with Hadoop

Mounting HDFS with NFS

Installing Hadoop onto an ODROID XU4 Cluster