Create Hive database with Jupyter Notebook

Diogo Rodrigues 1 Reputation point
2021-10-25T17:08:54.737+00:00

I have an HDInsight cluster and I want to create Hive databases and tables (and load data into them) using Jupyter Notebook.

Can anyone explain how I can do that? Are there any example notebooks that show this?

Azure HDInsight
An Azure managed cluster service for open-source analytics.

1 answer

  1. PRADEEPCHEEKATLA-MSFT 89,466 Reputation points Microsoft Employee
    2021-10-26T09:14:33.35+00:00

    Hello @Diogo Rodrigues ,

    Welcome to the Microsoft Q&A platform.

    Azure HDInsight Spark clusters provide kernels that you can use with the Jupyter Notebook on Apache Spark for testing your applications. A kernel is a program that runs and interprets your code. The three kernels are:

    • PySpark - for applications written in Python 2.
    • PySpark3 - for applications written in Python 3.
    • Spark - for applications written in Scala.

    Once you have created the Azure HDInsight Spark cluster, open Jupyter Notebook as follows:

    • From the Azure portal, select your Spark cluster. See List and show clusters for the instructions. The Overview view opens.
    • From the Overview view, in the Cluster dashboards box, select Jupyter Notebook. If prompted, enter the admin credentials for the cluster.

    143771-image.png

    • Select New, and then select PySpark, PySpark3, or Spark to create a notebook. Use the Spark kernel for Scala applications, the PySpark kernel for Python 2 applications, and the PySpark3 kernel for Python 3 applications.

    143729-image.png

    Now you can create Hive databases and tables from the Jupyter Notebook.

    • Create a database of your choice

    143734-image.png

    • Create a table of your choice

    143763-image.png
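Since the screenshots above do not survive as text, here is a minimal sketch of what the two steps might look like in a PySpark or PySpark3 notebook. It assumes the kernel has already defined a Spark session named `spark` with Hive support; the database name `demo_db`, the table name `people`, and its columns are hypothetical placeholders. The helper functions only build the statements, so they can be inspected before anything is run on the cluster.

```python
# Sketch: build HiveQL strings and run them through the notebook's Spark session.
# All names (demo_db, people, the columns) are hypothetical.
# str.format is used instead of f-strings so the code also runs on Python 2.


def create_database_sql(db_name):
    """Return a statement that creates the database if it is missing."""
    return "CREATE DATABASE IF NOT EXISTS {}".format(db_name)


def create_table_sql(db_name, table_name, columns):
    """Return a CREATE TABLE statement; columns is a list of (name, type) pairs."""
    column_list = ", ".join("{} {}".format(name, dtype) for name, dtype in columns)
    return "CREATE TABLE IF NOT EXISTS {}.{} ({})".format(
        db_name, table_name, column_list
    )


def run_statements(session, statements):
    """Run each statement in order via session.sql (a SparkSession method)."""
    for statement in statements:
        session.sql(statement)


statements = [
    create_database_sql("demo_db"),
    create_table_sql("demo_db", "people", [("id", "INT"), ("name", "STRING")]),
]

# Assumption: the kernel has already defined `spark`. Outside such a notebook
# this guard simply skips execution.
if "spark" in globals():
    run_statements(spark, statements)
    spark.sql("SHOW TABLES IN demo_db").show()
```

The HDInsight PySpark kernels also document a `%%sql` cell magic, so the same statements can usually be typed directly into a cell that starts with `%%sql` instead of being wrapped in `spark.sql(...)`.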

    For more details, refer to Kernels for Jupyter Notebook on Apache Spark clusters in Azure HDInsight.

    Also, please refer to the Hive manual for details on how to create tables and load or insert data into them.
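As a rough illustration of the load/insert step, the sketch below builds an `INSERT INTO ... VALUES` statement and a `LOAD DATA INPATH` statement. It assumes a table `demo_db.people (id INT, name STRING)` already exists and that the kernel has defined `spark`; the table name, the sample rows, and the file path are all hypothetical. Only the insert is executed, because `LOAD DATA INPATH` moves the source file into the table's storage location.

```python
# Sketch: build INSERT and LOAD DATA statements for an existing Hive table.
# demo_db.people, the sample rows, and the path are hypothetical placeholders.


def sql_literal(value):
    """Render a Python number or simple string as a SQL literal."""
    if isinstance(value, (int, float)):
        return str(value)
    text = str(value)
    # Quote escaping is deliberately not handled in this sketch.
    if "'" in text or "\\" in text:
        raise ValueError("quotes and backslashes are not handled in this sketch")
    return "'{}'".format(text)


def insert_rows_sql(db_name, table_name, rows):
    """Return one INSERT INTO ... VALUES statement covering all rows."""
    values = ", ".join(
        "({})".format(", ".join(sql_literal(v) for v in row)) for row in rows
    )
    return "INSERT INTO {}.{} VALUES {}".format(db_name, table_name, values)


def load_data_sql(path, db_name, table_name):
    """Return a LOAD DATA statement for a file already in cluster storage."""
    return "LOAD DATA INPATH '{}' INTO TABLE {}.{}".format(path, db_name, table_name)


insert_statement = insert_rows_sql("demo_db", "people", [(1, "Alice"), (2, "Bob")])
load_statement = load_data_sql("/example/data/people.txt", "demo_db", "people")

# Assumption: the kernel has already defined `spark` and the table exists.
if "spark" in globals():
    spark.sql(insert_statement)
    spark.sql("SELECT * FROM demo_db.people").show()
    # The LOAD DATA statement is only printed, since running it moves the file.
    print(load_statement)
```

For larger datasets, reading the file into a DataFrame and writing it to the table is often more practical than literal `VALUES` lists, but that depends on the file format and is not covered here.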

    Hope this helps. Please let us know if you have any further queries.
