Getting Started With Spark and Scala
What is Spark?

Spark is a distributed cluster-computing framework developed by the Apache Software Foundation. It builds on the Hadoop MapReduce model and is designed for fast computation. One of Spark's main features is in-memory cluster computing, which greatly increases the processing speed of an application.

Before you install Spark, please make sure Java and Hadoop are installed on your system. These instructions target Ubuntu 14.04 LTS.

Java Installation

Check whether your system has Java installed by typing the following at the command line:

$ java -version

You should see output like this:

java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)

If Java is not installed, please refer to this link.

Hadoop Installation

Verify your Hadoop installation with the following command:

$ hadoop version

Output should be some
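The two checks above can be combined into one small script. The sketch below is a hypothetical helper (not part of the original post) that reports whether each prerequisite command is on the PATH before you proceed with the Spark install:

```shell
#!/bin/sh
# check_prereqs.sh -- hypothetical helper: verify that Java and Hadoop
# are available on the PATH before installing Spark.

# Print "<cmd> found" if the command exists, otherwise warn on stderr.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 found"
  else
    echo "$1 NOT found -- install it before continuing" >&2
    return 1
  fi
}

# Count how many prerequisites are missing, but do not abort,
# so the script always prints a full summary.
missing=0
for tool in java hadoop; do
  check_cmd "$tool" || missing=$((missing + 1))
done
echo "$missing prerequisite(s) missing"
```

Using `command -v` rather than parsing `java -version` output keeps the check POSIX-portable; once both tools are found, you can still run `java -version` and `hadoop version` by hand to confirm the versions are recent enough.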