Form a JDBC connection string
val jdbcHostname = "10.21.31.41"
val jdbcPort = 1521
val jdbcDatabase = "dbname"
val jdbcUsername = "user1"
val jdbcPassword = "password1"
val driverClass = "oracle.jdbc.driver.OracleDriver"
val jdbcUrl = s"jdbc:oracle:thin:@$jdbcHostname:$jdbcPort/$jdbcDatabase"
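The URL above uses the host:port/name shape. Oracle's thin driver accepts more than one shape, and which one applies depends on whether "dbname" is a service name or a SID. The sketch below is only illustrative: the object OracleUrls and its method names are hypothetical helpers, not part of Spark or of the Oracle driver.

```scala
// Hypothetical helper (not part of Spark or the Oracle driver) that builds
// the two common Oracle thin-driver URL shapes.
object OracleUrls {
  // Service-name form: jdbc:oracle:thin:@//host:port/service
  def forServiceName(host: String, port: Int, service: String): String =
    s"jdbc:oracle:thin:@//$host:$port/$service"

  // SID form: jdbc:oracle:thin:@host:port:SID
  def forSid(host: String, port: Int, sid: String): String =
    s"jdbc:oracle:thin:@$host:$port:$sid"
}
```

If the connection fails with a "listener does not currently know of service" or similar error, trying the other form is a reasonable first check, for example OracleUrls.forSid(jdbcHostname, jdbcPort, jdbcDatabase).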
Set up connection properties
import java.util.Properties
val connectionProperties = new Properties()
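// Credentials used to authenticate against Oracle
connectionProperties.setProperty("user", jdbcUsername)
connectionProperties.setProperty("password", jdbcPassword)
// JDBC driver class that Spark should load
connectionProperties.setProperty("driver", driverClass)
// Number of rows fetched per database round trip
connectionProperties.setProperty("fetchsize", "100000")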
Form a query and create a Spark DataFrame
val query = "(select * from table1)"
val df = spark.read.jdbc(jdbcUrl, query, connectionProperties)
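The query is wrapped in parentheses because the second argument of spark.read.jdbc is treated as a table expression and placed in a FROM clause, so a bare SELECT would not be valid there. A minimal sketch of that wrapping step follows; the object JdbcQueries and the default alias "t" are assumptions made for illustration, not part of Spark.

```scala
// Hypothetical helper: wraps a SELECT statement so it can be passed as the
// `table` argument of spark.read.jdbc, which Spark places in a FROM clause.
object JdbcQueries {
  def asSubquery(sql: String, alias: String = "t"): String = {
    // Drop surrounding whitespace and a trailing semicolon, which would be
    // invalid inside a parenthesized subquery.
    val trimmed = sql.trim.stripSuffix(";").trim
    // Oracle table aliases are written without the AS keyword.
    s"($trimmed) $alias"
  }
}
```

It could then be used as spark.read.jdbc(jdbcUrl, JdbcQueries.asSubquery("select * from table1"), connectionProperties). The alias is optional on Oracle but keeps the expression portable to databases that require one.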
Note:
- Ensure that the Oracle JDBC driver jar is on the classpath when launching spark-shell or spark-submit (for example, via the --jars option) so that the driverClass can be loaded to make the connection.
- For more details on the fetchsize parameter, which controls how many rows are retrieved per round trip and can improve data retrieval performance, see https://docs.oracle.com/cd/A87860_01/doc/java.817/a83724/resltse5.htm
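To see why fetchsize matters, a rough back-of-the-envelope model helps: the number of network round trips needed to read a result set is about the row count divided by the fetch size, rounded up. The sketch below is a simplification that ignores row width, memory pressure, and network latency; the object FetchSizeMath is a hypothetical name used only for this illustration.

```scala
// Simplified model, for illustration only: estimates how many round trips
// are needed to pull `rows` rows when the driver fetches `fetchSize` rows
// per trip. Real performance also depends on row width, memory, and latency.
object FetchSizeMath {
  def roundTrips(rows: Long, fetchSize: Int): Long = {
    require(fetchSize > 0, "fetchSize must be positive")
    // Ceiling division without floating point
    (rows + fetchSize - 1) / fetchSize
  }
}
```

Under this model, reading 1,000,000 rows takes 100,000 round trips with the Oracle driver's default fetch size of 10, and 10 round trips with the fetchsize of 100000 used above. A larger fetch size also means more rows buffered in memory per trip, so the value is a trade-off rather than a setting to maximize.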
