204 questions with Azure HDInsight tags

1 answer

Collision on non-unique "headnodehost" hostname across HDInsight clusters

The context: Let's suppose I have multiple HDInsight 4.0 clusters. Also suppose that I would like to access the Hadoop services, e.g. the jobhistory server, running inside these clusters. Let's suppose I get the corresponding jobhistory address from each…

Azure HDInsight
An Azure managed cluster service for open-source analytics.
204 questions
asked 2021-03-03T10:10:10.343+00:00
Anonymous
answered 2021-03-29T22:13:24.28+00:00
Anonymous
1 answer One of the answers was accepted by the question author.

Azure HDInsight HBase - Create non-admin SSH user

I would like to create a standard, non-admin user account on all HDInsight nodes. It will be used to log in to the nodes and run some basic commands. There is an admin account created by default on all nodes, but is there a way for…

asked 2021-03-19T15:22:31.07+00:00
Nathan 21 Reputation points
accepted 2021-03-25T20:12:22.43+00:00
Nathan 21 Reputation points
1 answer

HWC and Hive (HDInsight): reserved keyword as a column name

When attempting to save a dataframe to a table that has a column named 'timestamp' with SaveMode.Overwrite, the following exception occurs: org.apache.hadoop.hive.ql.parse.ParseException:line 1:47 cannot recognize input near 'timestamp' 'timestamp'…

asked 2021-03-15T15:30:26.08+00:00
Core Velocity 1 Reputation point
commented 2021-03-23T04:21:29.887+00:00
PRADEEPCHEEKATLA-MSFT 84,531 Reputation points Microsoft Employee
1 answer

Setting StatusFolder for HDInsight Spark job triggered through Data Factory

Currently our Spark job runs create a number of folders with random GUID names in the root directory of the container we use as our HDInsight cluster storage. This seems to be the folder in the context of which the job runs; it has a…

.NET
Microsoft Technologies based on the .NET software framework.
3,575 questions
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,028 questions
asked 2021-03-11T21:33:49.843+00:00
Shweta Chandramouli 1 Reputation point
commented 2021-03-23T04:20:01.213+00:00
PRADEEPCHEEKATLA-MSFT 84,531 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

Outbound proxy to HDInsight management

Hello, How can we route traffic to HDInsight management through a proxy to avoid opening NSG outbound and also to not use an Azure firewall? https://video2.skills-academy.com/en-us/azure/hdinsight/hdinsight-restrict-outbound-traffic Thank you, …

asked 2021-03-18T12:00:04.383+00:00
Alex Boata 21 Reputation points
commented 2021-03-22T08:59:10.227+00:00
Alex Boata 21 Reputation points
0 answers

Conda installation in HDInsight takes too long to run and times out after several hours - what could be the reason?

Package installation with Anaconda on an HDInsight cluster takes too long to run and times out after several hours - what could be the reason?

asked 2021-03-15T16:52:42.13+00:00
Alphonse Okossi 141 Reputation points
commented 2021-03-19T10:56:08.017+00:00
PRADEEPCHEEKATLA-MSFT 84,531 Reputation points Microsoft Employee
1 answer

Looking for HDInsight Script Action sample scripts for installing a Python package in PySpark 3

Looking for HDInsight Script Action sample scripts for installing a Python package in PySpark 3

asked 2021-03-09T19:07:45.02+00:00
Alphonse Okossi 141 Reputation points
commented 2021-03-16T10:46:47.973+00:00
PRADEEPCHEEKATLA-MSFT 84,531 Reputation points Microsoft Employee
0 answers

Azure HDInsight VM list

This page mentions the minimum server size required, but it doesn't mention how many VMs are required. For example, what is the minimum number of head nodes and worker nodes required for HBase? Please advise.

asked 2021-03-01T13:24:53.503+00:00
maria franklin 1 Reputation point
commented 2021-03-10T23:55:51.433+00:00
Saurabh Sharma 23,781 Reputation points Microsoft Employee
1 answer

Unable to write dataframe into hive table

Team, we are using a Hive Interactive cluster and a Spark cluster. We have done the LLAP-related configuration on the Spark cluster. Now both clusters are interacting with each other without any issues. I tried to load a dataset (ADL Gen2 filesystem) into Hive…

asked 2021-03-03T08:11:57.007+00:00
commented 2021-03-04T17:52:12.913+00:00
MartinJaffer-MSFT 26,051 Reputation points
1 answer One of the answers was accepted by the question author.

Would like to request increase of quota for HDInsight West Europe

Would like to request increase of quota for HDInsight West Europe

asked 2021-03-01T15:47:55.907+00:00
Alphonse Okossi 141 Reputation points
accepted 2021-03-03T14:12:59.63+00:00
Alphonse Okossi 141 Reputation points
2 answers

Unable to create HDInsight cluster using Azure PowerShell

Unable to create an HDInsight cluster using the link below: https://video2.skills-academy.com/en-in/azure/hdinsight/hdinsight-administer-use-powershell New-AzHDInsightCluster: Line | 10 | -DefaultStorageAccountName…

asked 2021-02-10T22:00:38.47+00:00
Kareem Abdul 1 Reputation point
answered 2021-03-02T21:24:02.203+00:00
Kareem Abdul 1 Reputation point
1 answer

Azure HDInsight

Hi, I am new to Azure HDInsight. We are planning to create a new project in HDInsight. It requires Hadoop, BI, analytics, IoT, migration, Ambari, etc. What is the minimum number of VMs required for this project? Please advise.

asked 2021-03-01T12:24:36.863+00:00
maria franklin 1 Reputation point
commented 2021-03-02T01:24:01.213+00:00
Saurabh Sharma 23,781 Reputation points Microsoft Employee
0 answers

Deploy an edge node

Hi everyone, is it possible to deploy an edge node with a specific kernel on an existing HDInsight cluster? Best Regards, Simone

Microsoft Edge
A Microsoft cross-platform web browser that provides privacy, learning, and accessibility tools.
2,225 questions
asked 2021-02-25T16:11:05.94+00:00
Perigli, Simone 1 Reputation point
commented 2021-03-01T17:29:43.6+00:00
KranthiPakala-MSFT 46,437 Reputation points Microsoft Employee
2 answers One of the answers was accepted by the question author.

Spark cluster to read Hive on a different HDI cluster

I have two different HDI clusters, say Cluster A and Cluster B. One HDInsight cluster (Cluster A) is a Spark cluster and the other (Cluster B) is provisioned with Hive. I need to run Spark processing in Cluster A and connect to Hive, which is in Cluster…

asked 2021-02-10T17:09:14.947+00:00
answered 2021-02-26T10:57:43.397+00:00
1 answer One of the answers was accepted by the question author.

What is the maximum throughput of the Kafka REST proxy enabled on an HDInsight Kafka cluster?

I would like to set up a Kafka cluster that needs an ingestion (producer) throughput of around 150 MB/s. To achieve that in my local setup I need 4 REST proxy servers with 8 CPUs each. However, when I try to create a Kafka cluster…

asked 2021-02-16T09:06:50.3+00:00
Sai Birada 21 Reputation points
commented 2021-02-25T17:20:15.3+00:00
KranthiPakala-MSFT 46,437 Reputation points Microsoft Employee
2 answers One of the answers was accepted by the question author.

HDFS FileSystem utility not supported across multiple containers/storage accounts

Hi Team, I am facing a problem renaming a file location (path) from one container to another container using the rename function from the Hadoop (HDI) FileSystem utility (https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/FileSystem.html). Get to…

asked 2021-02-10T21:28:00.16+00:00
Senapathy, Kumaraswamy 21 Reputation points
commented 2021-02-18T23:25:46.963+00:00
KranthiPakala-MSFT 46,437 Reputation points Microsoft Employee
0 answers

Nodes are created with an anonymous subscription ID

Hi there, when I create an HDInsight cluster, the cluster is created with the subscription ID that I am in, but every time the nodes are formed with a different anonymous ID, and when I try to access the nodes, for example ZooKeeper, it throws 401…

asked 2021-02-02T23:24:21.263+00:00
superhero112' 1 Reputation point
commented 2021-02-17T03:56:23.693+00:00
PRADEEPCHEEKATLA-MSFT 84,531 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

Node Details in Azure HBase

Which node type is better for processing 1.5 million records daily where the total storage size is 10 TB? Is there any way to auto-scale an HBase cluster, as there will be only 1 to 2 transactions a day but the data size will be large?

asked 2021-01-27T07:15:15.297+00:00
Sarvesh Pandey 141 Reputation points
accepted 2021-02-02T14:54:23.337+00:00
Sarvesh Pandey 141 Reputation points
0 answers

Getting error when using SHC to query HBase data in Spark

Hello, I am trying to read data stored in HBase table from Spark cluster using SHC (reference link: https://video2.skills-academy.com/en-gb/azure/hdinsight/hdinsight-using-spark-query-hbase) I followed all the steps as is, but when running the command…

asked 2021-01-20T11:16:09.32+00:00
Paarth Gupta 1 Reputation point
commented 2021-01-28T11:03:55.077+00:00
PRADEEPCHEEKATLA-MSFT 84,531 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

Fetching the Spark YARN log from Azure HDInsight

Hi Team, currently I am submitting Spark jobs to an Azure HDInsight cluster through Livy. After a job finishes, I look in the Spark History Server for the YARN logs. The Livy log for each Spark job does not provide the YARN logs. Can we fetch the Spark…

asked 2020-12-08T14:19:51.793+00:00
Samrat 136 Reputation points
commented 2020-12-10T08:24:30.237+00:00
PRADEEPCHEEKATLA-MSFT 84,531 Reputation points Microsoft Employee