Monday, July 29, 2019

Build a Scala FAT Jar Using SBT

Create the following directory structure:

~/sbt_build  - This is the root directory
  - src/main/scala/   - Place the .scala files here
  - target  - Created automatically by sbt when the jar is built
  - project - Created automatically by sbt on the first build (create it yourself to add plugins.sbt, described later)
  - build.sbt - Prepare this file and place it here
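
The post does not show the application source, so here is a minimal sketch of a file that could go under src/main/scala/. It assumes an object named SimpleApp in the default package, to match the class name passed to spark-submit later on; the job itself (counting even ids) and the AppName value are hypothetical placeholders.

```scala
// src/main/scala/SimpleApp.scala (hypothetical example, not from the original post)
import org.apache.spark.sql.SparkSession

object SimpleApp {
  // Name shown in the Spark UI; assumed to match the object name
  val AppName = "SimpleApp"

  def main(args: Array[String]): Unit = {
    // No master is set here, so it is taken from spark-submit (--master)
    val spark = SparkSession.builder().appName(AppName).getOrCreate()
    try {
      // spark.range(1, 11) yields a single column "id" holding 1 to 10
      val evens = spark.range(1, 11).filter("id % 2 = 0").count()
      println(s"Even ids between 1 and 10: $evens")
    } finally {
      // Release cluster resources even if the job fails
      spark.stop()
    }
  }
}
```

Because the Spark dependencies in build.sbt are marked "provided", a class like this compiles against Spark but relies on the cluster to supply the Spark jars at run time.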

Contents of build.sbt

name := "SimpleApp"

version := "1.0"
scalaVersion := "2.11.6"
val sparkVersion = "2.2.0"

libraryDependencies += "org.apache.spark" %% "spark-core" % sparkVersion % "provided"
libraryDependencies += "org.apache.spark" %% "spark-sql" % sparkVersion % "provided"
libraryDependencies += "org.apache.spark" %% "spark-hive" % sparkVersion % "provided"
libraryDependencies += "oracle" % "ojdbc6" % "11.1.0.6"

libraryDependencies += "com.cedarsoftware" % "java-util" % "1.8.0"

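Note the Oracle JDBC dependency above: that jar is not available from the public Maven repositories. sbt treats any jar placed in the lib directory under the project root as an unmanaged dependency and puts it on the classpath, and sbt-assembly then includes it in the final jar. Place the ojdbc jar in the directory below. When the jar is supplied this way, the "oracle" % "ojdbc6" line in build.sbt is not needed, and sbt will fail to resolve it unless the artifact exists in a configured repository or local cache.

~/sbt_build/lib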
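
The folder sbt scans for hand-placed jars is controlled by the unmanagedBase setting, which defaults to lib under the project root. As a small sketch, the build.sbt line below states that default explicitly; replace the folder name if the jars live somewhere else.

```scala
// build.sbt fragment: where sbt looks for unmanaged (hand-placed) jars.
// baseDirectory is the project root (~/sbt_build in this post).
// "lib" is already sbt's default; change the folder name to use another location.
unmanagedBase := baseDirectory.value / "lib"
```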

There are a few ways to create jar files with sbt. We will use the sbt-assembly plugin, which adds the "sbt assembly" command, to build a single FAT jar. To add the plugin to the build, place a file named "plugins.sbt" in the project folder.

~/sbt_build/project/plugins.sbt

Add the following line to plugins.sbt:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")
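
FAT jar builds often stop with "deduplicate" errors when two dependencies ship a file at the same path. The post does not mention hitting this, so treat the following as an optional sketch for build.sbt, written in the sbt-assembly 0.14.x syntax. The main class name is assumed to be SimpleApp, and the choice of which files to discard is only an illustration.

```scala
// build.sbt fragment (optional; assumes sbt-assembly 0.14.x as added in plugins.sbt)

// Record the entry point in the jar manifest (assumed class name)
mainClass in assembly := Some("SimpleApp")

// Decide what to do when the same path appears in more than one dependency
assemblyMergeStrategy in assembly := {
  // Drop the manifests of the dependencies; the assembled jar gets its own
  case PathList("META-INF", "MANIFEST.MF") => MergeStrategy.discard
  // For everything else, fall back to the plugin's default strategy
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
```

Falling back to the default strategy keeps the plugin's built-in handling for the remaining files, so only the cases listed explicitly change behavior.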

From the root directory, run:
~/sbt_build$ sbt assembly

Once the build succeeds, the jar file can be found at:
~/sbt_build/target/scala-2.11/SimpleApp-assembly-1.0.jar ...

Submit the jar to the cluster to run the application:

spark-submit --class SimpleApp --master yarn target/scala-2.11/SimpleApp-assembly-1.0.jar

