Reading a Hive table from Spark

We will read from a Hive table using a Spark DataFrame. Here we will be using Spark version 2.1.1 and Hive version 1.2.

In Hive we already have a table called Merged_data_all in the video_analytics schema.


In the spark-shell:

For this we need a sqlContext object to write SQL queries. Below is the statement to create one:

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
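
Note: in Spark 2.x the spark-shell already starts with a SparkSession bound to the variable spark (and a ready-made sqlContext), so creating a new SQLContext is optional. A minimal sketch using the built-in objects:

// spark is the SparkSession created automatically by spark-shell in Spark 2.x
spark.sql("show databases").show()

// the pre-built sqlContext works the same way
sqlContext.sql("show tables in video_analytics").show()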

Then we can run queries directly through the sqlContext:

sqlContext.sql("Select * from video_analytics.Merged_data_all limit 10").show()
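
Instead of embedding the query in SQL, the same table can be loaded as a DataFrame and handled with the DataFrame API. A minimal sketch, equivalent to the query above:

// load the Hive table as a DataFrame via the metastore
val df = sqlContext.table("video_analytics.Merged_data_all")

// inspect the schema Spark picks up from the Hive metastore
df.printSchema()

// show the first 10 rows, same result as the SQL above
df.show(10)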


The sbt dependencies for this are (the versions below are 2.2.0; match them to the Spark version on your cluster):

name := "hive_table_test"

version := "0.1"

scalaVersion := "2.11.11"

// https://mvnrepository.com/artifact/org.apache.spark/spark-core
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.0"

// https://mvnrepository.com/artifact/org.apache.spark/spark-sql
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.2.0"
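
For a standalone application (as opposed to the spark-shell, which wires up Hive access for you), the spark-hive module is also needed, along with a SparkSession built with Hive support. A minimal sketch, assuming hive-site.xml is on the application classpath so the Hive metastore can be reached:

// https://mvnrepository.com/artifact/org.apache.spark/spark-hive
libraryDependencies += "org.apache.spark" %% "spark-hive" % "2.2.0"

import org.apache.spark.sql.SparkSession

object HiveTableTest {
  def main(args: Array[String]): Unit = {
    // enableHiveSupport() connects the session to the Hive metastore
    val spark = SparkSession.builder()
      .appName("hive_table_test")
      .enableHiveSupport()
      .getOrCreate()

    spark.sql("select * from video_analytics.Merged_data_all limit 10").show()

    spark.stop()
  }
}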

Please check the GitHub link below for the code.

Hive-Demo-From-Spark
