Three years ago, I worked through a project of creating a low-cost, low-power computer cluster, with the primary goal of becoming more familiar with the inner workings of Apache Spark. This project did accomplish that goal, but since this cluster was made up of 32-bit ARM processors and each had only 2 GB of RAM, the cluster was not too useful for getting meaningful work done. What it did excel at, however, was showing the user how to write Spark code efficiently. If you wanted to get anything done on this constrained system, you had to be mindful of every inefficiency.
Jump ahead to 2019, and I have decided to give it another go. This time, I want to make a cluster that is moderately useful for data analysis and machine learning, but does not break the bank either. The first step is to document my system design requirements and general goals:
- The cluster’s primary purpose will be to run Apache Spark to do data analysis. However, I would also like to experiment with ElasticSearch.
- The cluster should be usable for data sets up to 256 GB or more in size, and have storage for a few terabytes of data.
- The cluster should use 64-bit x86 CPUs. My last cluster was based on 32-bit ARM CPUs. This created some compatibility issues that were interesting to sort through, but I don’t want to deal with those sorts of problems this time.
- Building, configuring, and operating this cluster should facilitate the following learning objectives:
  - Understand the hardware tradeoffs for performance in distributed computing systems.
  - Advance my understanding and skill in Apache Spark.
  - Learn how to run Apache Spark in Kubernetes.
  - Experiment with HDFS 3.x and compare it to QFS 2.x.
- My target budget is $3,000, and I would prefer to come in under that but would go over if there is value in it.
- I want at least four physical nodes.
Probably the biggest driver in my hardware selection is my budget. I would love to get some of the new server-class motherboards. For example, SuperMicro’s new motherboard based on AMD’s EPYC 3251 CPU particularly caught my eye, but building out a single cluster node that reasonably leverages the features of the SuperMicro motherboard resulted in a per-node cost of well over $2,000. That would be most of my budget for a single node. Given my goal of having a distributed hardware environment, I would have to scale back my expectations on per-node capabilities.
I did a lot of searching for this. I considered doing my own node builds, looked at low cost computers from Walmart, and even looked at some of the more powerful single board computers, such as the UDOO Bolt. Given my general project goals, the technical parameters I looked at were:
- CPU Cores and Threads – The more threads the better for each node. This would allow a higher level of parallelism within the cluster. It would also make it easier to run the distributed file system on the same nodes that Spark is running on.
- CPU Speed – Ultimately I want this cluster to be useful and more than just a learning vehicle like my last cluster was. So I considered the CPU’s benchmark when differentiating between choices.
- Maximum RAM Capacity – Spark performs better the more RAM it has available, so you want a high ratio of RAM to thread count.
- NVMe SSD – Even though Spark leverages RAM as much as possible to make calculations fast, it still needs to spill data to disk in order to manage operations. Ideally, the disk it spills data to is fast. Furthermore, the distributed file system’s performance is related to the performance of its storage media. NVMe SSDs allow high performance at a relatively cheap price (1 TB NVMe SSDs can be found for about $100).
- Small Form Factor – I don’t have a data center to put this cluster into, just my desk in my home office. I want to keep its footprint small.
- Future Expandability – The two facets of expandability that are important here are more storage and faster networking. Ideally the node type I select would allow me to upgrade storage and networking if I wish to invest more money into the cluster.
At one point I was seriously considering the UDOO Bolt. It seemed to check all the qualities I was looking for in a node computer. Its eight thread AMD V1605B CPU seemed to provide strong performance for relatively low cost and thermal design power. I even placed a preorder as UDOO was accepting preorders for delivery in July 2019. But July 2019 came and went and there was no solid date for delivery, so I canceled my preorder and started looking again. And I am glad I did.
I ended up coming across the EGLOBAL S200 Mini PC. This computer is manufactured and distributed by a Chinese company, so you would have to order it from a site like AliExpress. What is interesting about this computer is that you can choose from four CPU options, including a Core i9 8950HK or a Xeon E-2176M, both of which provide 12 threads and have benchmark scores more than twice that of the UDOO Bolt’s AMD V1605B. Furthermore, the computer can accept up to 64 GB of DDR4 2666 MHz RAM, has two M.2 NVMe SSD slots, has room for a 2.5″ SATA3 SSD, and comes in an extremely small form factor. Best of all, the S200 can be purchased at about the same price as the UDOO Bolt. So I bought four. One thing I will note on sourcing these computers is that they seem to be widely sold by different resellers on AliExpress and even on Amazon (affiliate link), though it’s rebranded there, and they can be found at a wide range of prices. At the time of this writing, I found the Topton Computer Store (affiliate link) to have the lowest prices.
I bought the computers without any RAM or SSD, allowing me to select precisely which ones I felt would be best for my cluster build. Though this computer can accept up to 64 GB of RAM, the manufacturer doesn’t offer a prebuilt version with that much RAM installed. The computer has two SO-DIMM sockets, so I purchased two 32 GB RAM modules for each computer. For storage I decided to start with a 1 TB SSD. Given the computer’s expansion abilities, I could add more later if I desired. The SSD storage that the computer manufacturer offers was slow and relatively expensive. The EGLOBAL S200 computer uses PCIe Gen 3, which meant I couldn’t take advantage of the fastest PCIe Gen 4 SSDs, but there are plenty of fast SSDs for Gen 3. I decided to go with the Sabrent Rocket line of M.2 NVMe SSDs, chiefly due to its very high speed and relatively low cost. What I ended up purchasing for each node is as follows (note that all product links are affiliate links):
| Component | Details |
|---|---|
| EGLOBAL S200 Computer | with Xeon E-2176M CPU |
| 32 GB RAM Module | Samsung DDR4 2666MHz 260 Pin SODIMM, 1.2V |
| 1 TB SSD | Sabrent Rocket NVMe PCIe M.2 2280 SSD |
This puts my total per-node cost at $823.21. For four nodes, my total cluster cost already exceeds my $3,000 price target, but given the number of threads this cluster will have and the 256 GB RAM pool, I’m very happy with this cluster’s specs. You could cut costs by going with a cheaper CPU option for the S200, or by using less RAM or a smaller SSD. I would note that the price for the computer seems to fluctuate a lot on AliExpress. Your price might vary.
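The arithmetic behind those cluster-wide figures can be sketched in a few lines of Python. This is only an illustration: the `Node` class and `cluster_totals` function are hypothetical names made up for this sketch, and the inputs are the per-node figures quoted above (12 threads, two 32 GB modules, one 1 TB SSD, $823.21).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Per-node specs as quoted in the text (hypothetical helper type)."""
    threads: int      # CPU threads (Xeon E-2176M provides 12)
    ram_gb: int       # two 32 GB SO-DIMM modules
    storage_tb: int   # one 1 TB NVMe SSD
    cost_usd: float   # per-node cost quoted above


def cluster_totals(node: Node, count: int) -> dict:
    """Scale one node's specs up to a cluster of `count` identical nodes."""
    return {
        "threads": node.threads * count,
        "ram_gb": node.ram_gb * count,
        "storage_tb": node.storage_tb * count,
        "node_hardware_cost_usd": round(node.cost_usd * count, 2),
        # The RAM-to-thread ratio is a per-node property, so it does not scale.
        "ram_gb_per_thread": round(node.ram_gb / node.threads, 2),
    }


s200 = Node(threads=12, ram_gb=64, storage_tb=1, cost_usd=823.21)
totals = cluster_totals(s200, count=4)

for name, value in totals.items():
    print(f"{name}: {value}")
```

With four nodes this works out to 48 threads, 256 GB of RAM, 4 TB of storage, and $3,292.84 in node hardware before any networking gear, with a little over 5 GB of RAM available per thread.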
In order to complete the cluster, I needed to purchase some networking hardware. I plan to use the same network design as my last cluster, so all I really needed to get was a switch, cables to connect the nodes to it, and a USB ethernet adapter to connect the cluster to the outside world. Altogether, my cluster build looked like this:
| Item | Source | Price | Quantity |
|---|---|---|---|
| EGLOBAL S200 Computer Node (built out as described above) | | | |
| 5 Port Ethernet Switch | Amazon | $14.99 | 1 |
| 1 ft Cat 6 cables | Amazon | $15.99 | 1 |
| USB 3 Type C Ethernet Adapter | Amazon | $15.99 | 1 |
I used an existing ethernet cable I had to connect the cluster to my home network, but if you don’t have one, be sure to pick one up too. This brings my total cluster cost to $3,359.80. I feel that’s not bad for a cluster with 48 threads, 256 GB of RAM, and 4 TB of disk space.
Next up will be the physical build out and initial setup of the cluster.