Spark jobs not running in a notebook
I am currently running the "1.Reading Data - CSV" notebook from the "Read and write data in Azure Databricks" module on Microsoft learn. When I tried to run the cell "# A reference to our tab-separated-file", the Spark jobs…
Upsert data into SQL from delta table
Hello Team, we have a scenario where we have to get the data from the lake, process it, and then store it in a SQL database. This is what we are doing: read the entity from the Lake Store that is in the delta table _staging, do a merge between the delta table and…
Issue accessing delta table in Data Lake Gen2 storage account with Databricks cluster (latest stable version)
Recently, I have been encountering an issue where the Databricks cluster cannot access an unmanaged delta table whose parquet files are stored in the Azure Data Lake Gen2 storage account. The issue is that it cannot read/update from the…
Install third party libraries in Azure Databricks
Hello, I am trying to install the library "pythonnet" in Azure Databricks. I tried installing it through PyPI, through the Python Wheel option, and also the JAR option. None of these work for me. I need to connect a Databricks notebook to Azure…
I/O operations with Azure Databricks REST Jobs API
I have experienced problems with the delivery of arguments via the Jobs API. I've outlined the problems in detail on Stack Overflow: https://stackoverflow.com/questions/62758094/i-o-operations-with-azure-databricks-rest-jobs-api I would…
Machine Learning Model Deployment
I am new to ML models and am researching using Azure Databricks and MLflow to train a model. My question is: once the model is created, is there a way to host the model so that it can be downloaded and inferenced remotely? I am looking for options other than…
Azure Web Application with computationally intensive tasks in Dask and TensorFlow
Hello, I'm developing a data analysis tool for the processing of data from Hydrogen-Deuterium exchange mass spectrometry. We would like to accompany our publication with a deployment of the code on Microsoft Azure so that other researchers can quickly…
Spark Connector in ADF
Hi, I have created a Spark connector to connect to Azure Databricks. In the copy activity the source is the Spark connector and the sink is Azure SQL DB. In the Spark connector query, CreatedDate is being converted to String and throwing an error, whereas it is a timestamp…
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 7.0 failed 4 times, most recent failure: java.lang.NoClassDefFoundError: Could not initialize class
Hi, I am getting this error despite defining the class. When I execute the notebook the first time it works fine, but when I execute the same notebook without any code change it starts throwing this error. As per the error, the class is not defined, but trust me the class…
Databricks Scala: data frame column encoding from UTF-8 to Windows-1252
Hi, I am working with Databricks where I have the data in parquet and I am generating smaller files out of it. I have a column which is a string containing various characters, and I have to encode this string value to Windows-1252 or Windows…
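A minimal sketch of the re-encoding step described above, in plain Python. On Databricks the same function could be wrapped in a UDF and applied to the string column; the function and sample values below are illustrative, not from the post.

```python
# 'cp1252' is Python's codec name for Windows-1252. Characters with no
# Windows-1252 mapping are replaced with '?' instead of raising an error;
# use errors="strict" instead if unmappable characters should fail loudly.

def to_windows_1252(value: str) -> bytes:
    return value.encode("cp1252", errors="replace")

sample = "Café – naïve"   # accented characters and en dash both map in cp1252
encoded = to_windows_1252(sample)
print(encoded)
```

Decoding the result with `bytes.decode("cp1252")` round-trips the original string for any characters that exist in the Windows-1252 code page.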
Third party Python package installed on Databricks cluster gives different results than other Python stacks
We get a Python package developed by a third party. The package implements a standard mathematical model, no machine learning, no randomization. The model turned out to return incorrect results when installed on a Databricks cluster. We tried different…
Databricks Notebook Activity parameter problem
I feel this is a bug but not sure if it is with ADF or Databricks. I am running a notebook using ADF notebook activity. My notebook has a widget for which I pass the value from ADF. As I need to manually enter the parameter name while configuring…
Spark SQL: How to get the 5th column from a Spark SQL query
Hi, I have a headerless file which I am reading with spark.read to create a data frame. Now I want to get the value of the 5th column from the file. The file is comma separated. How do I achieve this? I know it is possible in T-SQL but not sure how to…
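A sketch of the indexing idea, assuming a headerless comma-separated file. When Spark reads such a file it auto-names the columns _c0, _c1, …, so the 5th column can be selected with `df.select("_c4")` or, position-based, `df.select(df.columns[4])`. The plain-Python version below shows the same zero-based indexing on illustrative data:

```python
import csv
import io

# Illustrative data, not from the post: two rows, five comma-separated fields.
data = io.StringIO("a,b,c,d,e\n1,2,3,4,5\n")
rows = list(csv.reader(data))

# Index 4 is the 5th column, because indexing is zero-based.
fifth_column = [row[4] for row in rows]
print(fifth_column)  # → ['e', '5']
```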
SparkException: Job aborted due to stage failure: Task not serializable: java.io.NotSerializableException:
Hi, I am running this code but this is throwing this error: SparkException: Job aborted due to stage failure: Task not serializable: java.io.NotSerializableException:
Azure Databricks - Split column based on special characters in Databricks
I have a column in my csv file that possibly has values in the formats below: "Q1_1__Value_-_10_counts" "Value_10_counts" "Q1_1__1__value_yes" These have to be split as below, respectively: "Value_-_10_counts" …
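A hedged sketch of the splitting rule: assuming the intent is "strip a leading question prefix like `Q1_1__`", one way to express it is a regex substitution. The pattern below is a guess from the one fully visible input/output pair; the remaining expected outputs are truncated in the post, so the rule may need adjusting. In Spark the same pattern could be applied with `regexp_replace` from `pyspark.sql.functions`.

```python
import re

def strip_question_prefix(value: str) -> str:
    # Remove a leading "Q<digits>_<digits>__" prefix, if present;
    # values without the prefix pass through unchanged.
    return re.sub(r"^Q\d+_\d+__", "", value)

print(strip_question_prefix("Q1_1__Value_-_10_counts"))  # → Value_-_10_counts
print(strip_question_prefix("Value_10_counts"))          # unchanged: no prefix
```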
More convenient service to read avro files from Azure Data Lake Gen2
Hi, I have to read lots of avro files created by an Event Hub Capture in a Data Lake Gen2. Data must be filtered, processed and then applied to train a machine learning model. I'm considering Azure Databricks and the Azure Machine Learning service…
Azure IoT - Query Data from IoT Files
Hello, I am using Azure (Azure Databricks, IoT Hub) to stream unstructured data from IoT devices (i.e. wind turbines), in the form of thousands of files with millions of data points captured over a period of 10 years. How do I extract a variety of metadata…
File(filePath).exists does not work in Azure Databricks
Hi, how do I find whether a file exists in a path in the data lake? Regards, Rajaniesh
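A sketch under stated assumptions: `java.io.File(filePath).exists` checks the driver's local filesystem, not the data lake, which is likely why it "does not work". `dbutils.fs.ls` sees DBFS/ADLS paths instead, and it raises an exception for a missing path, so a common pattern is to wrap it in try/except. The listing function is injected as a parameter here so the helper can be exercised off-cluster; on Databricks you would pass `dbutils.fs.ls`.

```python
import os

def path_exists(path, ls=os.stat):
    # ls is any function that raises for a missing path.
    # On Databricks: path_exists("abfss://.../myfile", ls=dbutils.fs.ls)
    # Locally: the os.stat default works for both files and directories.
    try:
        ls(path)
        return True
    except Exception:
        return False
```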
Accessing dataframe created in Scala from Python command
Is there a way to create a Spark dataframe in a Scala command and then access it in Python, without explicitly writing it to disk and re-reading? In Databricks I can do, in Scala, dfFoo.createOrReplaceTempView("temp_df_foo") and then in…
Standard Configuration Components of Azure Databricks
Hello, could you please tell me the standard configuration components of Azure Databricks. What are the Azure components (storage?) required for the configuration of Azure Databricks? Thank you. Sincerely, Kenjiro Majima