Hardware Selection for a Low Cost Cluster

For this first phase of my project, I will be building a cluster on the cheap. My cost goal was $600 for a 4-node system, or $150 per node. The Raspberry Pi immediately comes to mind as an option. Indeed, there have been several projects where people turned a set of Raspberry Pis into a cluster. However, since my goal is to create a cluster for Spark, the size of the cluster's aggregate RAM pool is important. The Raspberry Pi 3 has only 1 GB of RAM, so I explored whether there were other options with more. I found the ODROID-XU4 single-board computer, which has 2 GB of RAM and an 8-core ARM CPU. Furthermore, each board is only $76 without storage. It also has some other nice features: USB 3 ports, HDMI output, an eMMC 5.0 storage connector, a MicroSD slot supporting the faster UHS-1 standard, onboard gigabit Ethernet, and a serial console port. One consideration is data storage. Read More …
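The budget and memory arithmetic behind that comparison can be sketched in a few lines of Python. The dollar and RAM figures are the ones quoted in the excerpt; the function and constant names are made up for illustration, and the cost of storage, power, and networking is not known from the excerpt, so only the headroom left after buying the boards is computed.

```python
# Figures quoted in the post: $600 budget, 4 nodes, $76 per ODROID-XU4 board
# (without storage), 1 GB RAM on the Raspberry Pi 3, 2 GB on the ODROID-XU4.
BUDGET_USD = 600
NODES = 4
XU4_BOARD_USD = 76
RAM_GB_PER_NODE = {"Raspberry Pi 3": 1, "ODROID-XU4": 2}


def per_node_budget(budget_usd, nodes):
    """Budget available for each node, including storage and accessories."""
    return budget_usd / nodes


def aggregate_ram_gb(ram_per_node_gb, nodes):
    """Total RAM across the cluster, the figure that matters for Spark."""
    return ram_per_node_gb * nodes


def remaining_after_boards(budget_usd, board_usd, nodes):
    """Money left for storage, power, and networking once boards are bought."""
    return budget_usd - board_usd * nodes


if __name__ == "__main__":
    print("Per-node budget: $%.0f" % per_node_budget(BUDGET_USD, NODES))
    for board, ram in RAM_GB_PER_NODE.items():
        print("%s cluster RAM: %d GB" % (board, aggregate_ram_gb(ram, NODES)))
    left = remaining_after_boards(BUDGET_USD, XU4_BOARD_USD, NODES)
    print("Left after ODROID-XU4 boards: $%d" % left)
```

With four nodes, the ODROID-XU4 option doubles the aggregate RAM pool relative to the Raspberry Pi 3 while the boards themselves use roughly half of the budget.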

DIY Big Data Project Goal

I have always been interested in large-scale computing. This interest started back in graduate school when I was studying astrodynamics. Most of the problems you want to solve in astrodynamics require numerical computation, as only the 2-body problem has a closed-form solution. To solve anything more complex, you first have to make some simplifying assumptions (e.g., no other gravitational influences), and then you have a set of ordinary differential equations that describe the motion of your system. The only way to use these equations to predict the positions of the bodies is through numerical integration. In simple systems this sort of calculation is pretty straightforward, though you have to pay a lot of attention to round-off error. However, if you are trying to predict the progression of a debris field in space, which requires you to simultaneously project the motion of tens of thousands of objects, parallelized computation starts to look attractive. Compute Read More …
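As a rough illustration of what numerical integration of an orbit looks like, here is a minimal Python sketch. It is not the method used in the original work: the velocity Verlet scheme, the normalized units (gravitational parameter set to 1), and all function names are assumptions chosen for brevity. It propagates one object around a fixed central body and uses the specific orbital energy as a simple check on accumulated error.

```python
import math


def acceleration(x, y, gm=1.0):
    """Gravitational acceleration toward a central body fixed at the origin."""
    r3 = (x * x + y * y) ** 1.5
    return -gm * x / r3, -gm * y / r3


def verlet_orbit(x, y, vx, vy, dt, steps, gm=1.0):
    """Advance position and velocity by `steps` velocity Verlet steps."""
    ax, ay = acceleration(x, y, gm)
    for _ in range(steps):
        # Half-step velocity update with the current acceleration.
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        # Full-step position update.
        x += dt * vx
        y += dt * vy
        # Recompute acceleration at the new position, finish the velocity step.
        ax, ay = acceleration(x, y, gm)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
    return x, y, vx, vy


def specific_energy(x, y, vx, vy, gm=1.0):
    """Kinetic plus potential energy per unit mass; constant on a true orbit."""
    return 0.5 * (vx * vx + vy * vy) - gm / math.hypot(x, y)


if __name__ == "__main__":
    steps = 1000
    dt = 2.0 * math.pi / steps  # one period of a unit circular orbit
    state = verlet_orbit(1.0, 0.0, 0.0, 1.0, dt, steps)
    print("final position:", state[0], state[1])
    print("energy drift:", specific_energy(*state) + 0.5)
```

Under the simplifying assumption mentioned above (no gravitational influence between the objects themselves), each piece of debris could be advanced by a call like `verlet_orbit` independently of the others, which is what makes the problem a natural fit for parallel computation.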