Sunday, May 8, 2016

Install Apache Spark in standalone mode on Ubuntu

This post explains the detailed steps to set up Apache Spark 1.6.1 on Ubuntu 14.04.

Steps:
  1. Install Java
  2. Install Scala
  3. Install Git
  4. Build Spark

Install Java

Spark needs Java to run, so install Java first. On an Ubuntu machine this is easy to do from the command line.
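One way to do this, as a minimal sketch: it assumes OpenJDK 7 from the standard Ubuntu 14.04 repositories is acceptable. The sample version output in this post comes from Oracle's JDK, so the banner wording will differ with OpenJDK. The function name install_java is only illustrative; wrapping the commands in a function means nothing is installed until you call it.

```shell
# Hypothetical helper, not from the original post.
# Assumption: the openjdk-7-jdk package from the Ubuntu 14.04 repositories.
install_java() {
    sudo apt-get update                      # refresh the package index
    sudo apt-get install -y openjdk-7-jdk    # install the JDK without prompting
}
```

Run install_java in the same shell to perform the installation.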

Check the Java version to confirm that it has been installed successfully.
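A small check along these lines should work. The guard and the java_on_path variable are additions for illustration; on a machine where Java is already installed, java -version on its own is enough.

```shell
# Print the Java version if a java launcher is on PATH.
if command -v java >/dev/null 2>&1; then
    java_on_path=yes
    java -version          # note: java writes the version banner to stderr
else
    java_on_path=no
    echo "java not found on PATH; install Java first"
fi
```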
It shows the installed Java version, as follows:
java version "1.7.0_79"
Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)

Install Scala
The next step is to install Scala. First, download Scala from here.

Copy the downloaded file to some location, for example /usr/local/src, untar it, and set the path variable.
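The copy and untar steps can be sketched as below. The function name unpack_scala and the archive name scala-2.10.4.tgz are assumptions for illustration (the name is inferred from the 2.10.4 version this post reports); use whatever file you actually downloaded.

```shell
# Hypothetical helper: unpack_scala ARCHIVE DEST
# Copies ARCHIVE into DEST and extracts it there.
unpack_scala() {
    archive="$1"
    dest="$2"
    cp "$archive" "$dest/" &&
        tar -xzf "$dest/$(basename "$archive")" -C "$dest"
}
```

Writing to /usr/local/src normally needs root, so call it from a root shell, for example: unpack_scala scala-2.10.4.tgz /usr/local/src. With that archive name, Scala should end up in /usr/local/src/scala-2.10.4.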

Then add the Scala bin directory to PATH at the end of your ~/.bashrc file.
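The lines to append might look like this. Both the SCALA_HOME variable and the /usr/local/src/scala-2.10.4 location are assumptions based on the example path and the version used in this post; adjust them to wherever the archive was actually unpacked.

```shell
# Assumed install location; change it if Scala was unpacked elsewhere.
export SCALA_HOME=/usr/local/src/scala-2.10.4
# Put the Scala launchers (scala, scalac) ahead of the existing PATH entries.
export PATH="$SCALA_HOME/bin:$PATH"
```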
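Reload .bashrc so that the change takes effect, for example by running source ~/.bashrc or by opening a new terminal.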
Then check that Scala is installed successfully.
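A quick check, sketched with a guard so that a missing PATH entry gives a readable hint. The scala_on_path variable is illustrative only; scala -version by itself does the same job once PATH is set.

```shell
# Print the Scala version if the scala launcher is on PATH.
if command -v scala >/dev/null 2>&1; then
    scala_on_path=yes
    scala -version
else
    scala_on_path=no
    echo "scala not found on PATH; check the PATH entry in ~/.bashrc"
fi
```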
It shows the installed Scala version:
Scala code runner version 2.10.4 -- Copyright 2002-2013, LAMP/EPFL

Then just type scala to enter the interactive shell.

Install Git 
Install Git, since the Spark build depends on it.
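On Ubuntu 14.04 this is normally a single package install. A minimal sketch, assuming the stock git package from the Ubuntu repositories; install_git is an illustrative name, and the function wrapper keeps the command from running until it is called.

```shell
# Hypothetical helper, not from the original post.
install_git() {
    sudo apt-get install -y git    # assumes the stock Ubuntu git package
}
```

Afterwards, git --version should report the installed version.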

Build Spark 
Download the Spark distribution from here.
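Once the archive is downloaded it needs to be unpacked. The sketch below assumes the source release was saved as spark-1.6.1.tgz in the current directory; that file name is an assumption based on the version this post targets.

```shell
# Assumed file name of the downloaded Spark 1.6.1 source archive.
spark_archive="spark-1.6.1.tgz"
if [ -f "$spark_archive" ]; then
    tar -xzf "$spark_archive"       # extract into the current directory
    cd "${spark_archive%.tgz}"      # enter the extracted source tree
else
    echo "$spark_archive not found; download it first"
fi
```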



Building

Spark is built with SBT (Simple Build Tool), which comes bundled with the Spark distribution. The next step is to compile the code.
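A sketch of the build step, assuming you are in the top-level directory of the Spark 1.6.1 source tree. The build/sbt launcher and the assembly target are what the Spark 1.6 build documentation describes; the guard around them is an addition for illustration.

```shell
# Launcher script shipped inside the Spark 1.6.x source tree.
sbt_launcher="build/sbt"
if [ -x "$sbt_launcher" ]; then
    "$sbt_launcher" assembly    # compile Spark and build the assembly jar
else
    echo "$sbt_launcher not found; cd into the Spark source directory first"
fi
```

The first run also downloads SBT itself and the build dependencies, so it needs a network connection.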
The build takes some time. After it has packaged successfully, you can test it with a sample program.
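Spark ships a run-example script for this. The sketch below runs the bundled SparkPi example from the top of the Spark directory; the argument 10 (the number of partitions to spread the work over) is an arbitrary choice for illustration.

```shell
# Script shipped with Spark for running the bundled examples.
example_runner="bin/run-example"
if [ -x "$example_runner" ]; then
    "$example_runner" SparkPi 10    # estimate Pi using 10 partitions
else
    echo "$example_runner not found; run this from the Spark directory"
fi
```

SparkPi estimates Pi by random sampling, so the digits it prints vary from run to run.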




You should then see output such as: Pi is roughly 3.14634. Spark is ready to fire.