Create the following directory structure:
~/sbt_build - This is the root directory
- src/main/scala/ - Place the .scala files here
- target - Created automatically by sbt when the jar is built
- project - Holds plugins.sbt (described below); sbt also writes its own build files here
- build.sbt - Prepare this file and place it here
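For reference, here is a minimal sketch of what a source file under src/main/scala/ might look like. The original application code is not shown here, so everything in this block is an assumption: the file name SimpleApp.scala, the object name SimpleApp (chosen to match the --class argument used with spark-submit further down), and the trivial job it runs.

```scala
// Hypothetical file: ~/sbt_build/src/main/scala/SimpleApp.scala
import org.apache.spark.sql.SparkSession

object SimpleApp {
  def main(args: Array[String]): Unit = {
    // The master is not set here; it is supplied by spark-submit (--master yarn).
    val spark = SparkSession.builder()
      .appName("SimpleApp")
      .getOrCreate()

    // Placeholder job: count the even ids in a small generated range.
    val evens = spark.range(0, 100).filter("id % 2 = 0").count()
    println(s"Even ids: $evens")

    spark.stop()
  }
}
```

The object sits in the default package, so the fully qualified class name is just SimpleApp. If the code lives in a package, the --class argument needs the package prefix as well.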
Contents of build.sbt
name := "SimpleApp"
version := "1.0"
scalaVersion := "2.11.6"
val sparkVersion = "2.2.0"
libraryDependencies += "org.apache.spark" %% "spark-core" % sparkVersion % "provided"
libraryDependencies += "org.apache.spark" %% "spark-sql" % sparkVersion % "provided"
libraryDependencies += "org.apache.spark" %% "spark-hive" % sparkVersion % "provided"
libraryDependencies += "oracle" % "ojdbc6" % "11.1.0.6"
libraryDependencies += "com.cedarsoftware" % "java-util" % "1.8.0"
Note that the build above includes an Oracle JDBC jar, which is not available from any public Maven repository. Place the ojdbc jar in the lib directory under the project root and sbt will pick it up as an unmanaged dependency. Any jar placed under lib is added to the classpath and ends up in the final JAR. An unmanaged jar needs no libraryDependencies entry, so if sbt reports the "oracle" % "ojdbc6" line as an unresolved dependency, remove that line from build.sbt.
~/sbt_build/lib
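If the jar has to live somewhere other than sbt's default unmanaged directory (lib under the project root), the unmanagedBase setting can point sbt at another folder. This is only a sketch of an optional addition to build.sbt; the folder name custom_lib is a made-up example, not something from the original setup.

```scala
// Optional addition to build.sbt.
// "custom_lib" is a hypothetical folder under the project root (~/sbt_build/custom_lib).
// Every jar placed in it is treated as an unmanaged dependency.
unmanagedBase := baseDirectory.value / "custom_lib"
```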
There are several ways to create JAR files with sbt. Here we use the sbt-assembly plugin to build a single fat JAR that bundles the application together with its non-provided dependencies. To add the plugin, place a file named "plugins.sbt" in the project folder:
~/sbt_build/project/plugins.sbt
Add the following line to plugins.sbt:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")
From the root directory, run:
~/sbt_build$ sbt assembly
Once the build succeeds, the jar file can be found at:
~/sbt_build/target/scala-2.11/SimpleApp-assembly-1.0.jar ...
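By default, sbt-assembly builds the jar file name from the name and version settings in build.sbt, as <name>-assembly-<version>.jar. The sketch below shows two optional settings that could be added to build.sbt, written in the sbt-assembly 0.14.x syntax that matches the plugin version above. The jar name simple-app.jar is a hypothetical value chosen for illustration, and SimpleApp is assumed to be the application's main object.

```scala
// Optional additions to build.sbt (sbt-assembly 0.14.x syntax).

// Record the entry point in the jar manifest.
// "SimpleApp" is assumed to be the object that defines main.
mainClass in assembly := Some("SimpleApp")

// Override the default <name>-assembly-<version>.jar file name.
// "simple-app.jar" is a hypothetical name.
assemblyJarName in assembly := "simple-app.jar"
```

If the jar name is overridden this way, the path passed to spark-submit has to change to match.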
Submit the jar to the cluster to run the application:
spark-submit --class SimpleApp --master yarn target/scala-2.11/SimpleApp-assembly-1.0.jar
