HiveServer2 JDBC Client on AWS

Connecting to Hive database can be done using Hive CLI or beeline from a command prompt and programmatically using a JDBC client. The following section describes setting up a JDBC client for HiveServer2 on Eclipse.

Download the required Hive 0.13.1 JDBC jar files from Amazon site in the section titled ‘To download JDBC drivers’. Extract ZIP file and you can see all jar files in the extracted-to folder.

Using Eclipse IDE, create a Java project and configure the Build Path to add external JARs downloaded in the above step.

Use the following program to connect to database on Hive and run the program. It connects to Hive database on Hadoop NameNode and displays the results on the Eclipse console.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class JdbcClient {
	public static void main(String[] args) {

		try {
			Class.forName("com.amazon.hive.jdbc41.HS2Driver");
		} catch (ClassNotFoundException e) {
			e.printStackTrace();
		}

		Connection conn = null;
		ResultSet rs = null;
		try {
			conn = DriverManager.getConnection("jdbc:hive2://<namenode_ipaddress>:10000/default", "hadoop", "");
			Statement stmt = conn.createStatement();
			String addJarQuery = "ADD JAR /path/on/namenode/to/json-serde-1.3-jar-with-dependencies.jar";
			String queryStr = "select * from my_table limit 10";

			stmt.execute(addJarQuery);
			rs = stmt.executeQuery(queryStr);

			while (rs.next()) {
				System.out.println(rs.getString(1));
			}
		} catch (SQLException e) {
			e.printStackTrace();
		} finally {
			try {
				if (conn != null)
					conn.close();

			} catch (SQLException e) {
				e.printStackTrace();
			}
		}

		System.out.println("End of the program");
	}
}

P.S: In the above code, if ‘ADD JAR’ Hive command is not there, you get the following error.

java.lang.ClassNotFoundException Class org.openx.data.jsonserde.JsonSerDe
Advertisements

Drop Hive Partitions with Special Characters

I have a partition for a Hive table created wrongly. See the query below. Hive uses URL-encoded values for special characters.

$ hive -e 'show partitions mytable'
date=$%7Bhiveconf%3AINPUT_DATE}

How to drop it?

It will not work if you use the same value displayed above to drop it, even if Hive says OK.

hive> alter table mytable drop partition (date='$%7Bhiveconf%3AINPUT_DATE}');
OK
Time taken: 0.804 seconds

You know the actual partition when you created. If don’t remember check here to know what is the equivalent value for each encoded character value, and use the actual value to drop it.

hive> alter table mytable drop partition (date='${hiveconf:INPUT_DATE}');