Network Design for the Low Cost Cluster

Our first task in building any cluster is to design how it will be set up, most notably how the nodes will interact with each other. The cluster we will be building has four nodes: one master node and three slaves. All of the nodes will be connected to one another through the Ethernet switch. However, we want the node-to-node communication to be on its own network. This helps maximize throughput for node-to-node communication, which is important for distributed computation, and also makes the cluster behave more like a single device to the external network. The benefit of doing this is that we can add nodes to or remove nodes from the cluster without a client ever knowing. However, this approach does present some challenges with how an external client (e.g., your laptop) will interact with the data analysis software, such as Hadoop, but we will deal with that later. My goal is really to create a “data analysis appliance”, so requiring any client …
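
To make the layout concrete, here is a minimal sketch of an address plan for the private node-to-node network. The subnet (10.0.0.0/24) and the host names (master, slave1 through slave3) are assumptions chosen purely for illustration; nothing above fixes them, and any private range would work just as well.

```python
import ipaddress

# Assumption: this subnet and these host names are hypothetical placeholders,
# not values taken from the cluster build itself.
PRIVATE_SUBNET = ipaddress.ip_network("10.0.0.0/24")
NODE_NAMES = ["master", "slave1", "slave2", "slave3"]


def plan_addresses(subnet, names):
    """Assign the first usable host addresses in the subnet to the nodes, in order."""
    hosts = iter(subnet.hosts())  # usable addresses, excluding network/broadcast
    return {name: next(hosts) for name in names}


plan = plan_addresses(PRIVATE_SUBNET, NODE_NAMES)

# Print the plan in /etc/hosts style: address, then host name.
for name, addr in plan.items():
    print(f"{addr}\t{name}")
```

Because the nodes only ever refer to each other by these private addresses, adding a fifth node would just mean appending another name to the list; nothing on the external network would need to change.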