2,070 questions with Azure Databricks tags

1 answer One of the answers was accepted by the question author.

Best way to connect to a Databricks data mart for further exploration in Power BI

Situation: Databricks is used as the enterprise data platform. In reality, local teams are moving at a different speed than the global organization. To support this while still keeping control of the "golden source", we're setting up an…

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
2,070 questions
Not Monitored
Tag not monitored by Microsoft.
37,664 questions
asked 2021-01-14T13:44:27.313+00:00
Peter Verrykt 21 Reputation points
accepted 2021-01-17T13:10:39.733+00:00
Peter Verrykt 21 Reputation points
1 answer

How do I orchestrate ML model retraining periodically?

I have to retrain a PyTorch model every month or so. It is trained on data obtained by processing tables stored in Azure Data Lake Storage Gen1. So far, I have the following building blocks: a Databricks notebook that does the ETL job of…

Azure Machine Learning
An Azure machine learning service for building and deploying models.
2,714 questions
Azure Databricks
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,128 questions
asked 2021-01-08T23:32:08.983+00:00
Davide Fiocco 31 Reputation points
answered 2021-01-11T08:57:22.71+00:00
Ramr-msft 17,731 Reputation points
1 answer One of the answers was accepted by the question author.

What is the use of the oldest-time-to-consider parameter in the Jobs API?

Hi, I haven't found any documentation on what the value "oldest-time-to-consider": "1457570074236" is used for in the Databricks Jobs API. Can someone please direct me to the documentation that explains the significance of…

Azure Databricks
asked 2020-12-15T22:39:08.36+00:00
VishR 21 Reputation points
commented 2021-01-05T16:22:04.547+00:00
VishR 21 Reputation points
1 answer One of the answers was accepted by the question author.

Can you write multiple streaming queries (same schema, different input sources) into the same Azure storage without overwriting?

Hi, I have a requirement to write multiple streaming queries (same schema, different input sources) into the same Azure blob delta lake gen 3 storage without overwriting. I need the data to co-exist in the same write directory, say like in 'append'…

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
1,424 questions
Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.
2,615 questions
Azure Databricks
asked 2020-12-30T20:18:54.43+00:00
Mayuri Kadam 81 Reputation points Microsoft Employee
accepted 2021-01-04T16:59:00.253+00:00
Mayuri Kadam 81 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

Is there any way to do custom dynamic mapping of a varying number of columns in a data flow, or any other option to achieve this?

My source (a CSV file in ADLS) has a header record (3 columns), detail records (5 columns), and a trailer record (2 columns). The header record has fewer columns than the detail records. When I try to convert this CSV file to Parquet, I'm getting the…

Azure Databricks
asked 2020-12-30T07:35:06.643+00:00
accepted 2021-01-04T07:01:50.97+00:00
1 answer

MLOps using Azure Databricks & Azure ML - question on data prep for model inference and retraining.

I am using this blog (https://databricks.com/blog/2020/10/13/using-mlops-with-mlflow-and-azure.html) to set up MLOps using Azure Databricks & Azure ML. As mentioned in the blog, we deploy the MLflow model into an Azure ML environment using the built-in…

Azure Machine Learning
Azure Databricks
asked 2020-12-14T15:09:41.913+00:00
Kiran Purushotham 11 Reputation points
commented 2021-01-02T12:20:39.327+00:00
GermanM 1 Reputation point
1 answer

How to use an ARM template to restrict publicBlobAccess on the managed Databricks storage account

In our organisation we are required to disable publicBlobAccess and enable TLS1_2 as the minimum version on all storage accounts. Preferably we also use the StorageV2 type instead of BlobStorage. When we create a Databricks workspace using an ARM template, the…

Azure Databricks
asked 2020-12-04T07:52:20.833+00:00
Leerdam van, J (Jean-Marc) 16 Reputation points
commented 2020-12-24T08:18:12.133+00:00
Janne Kujanpää 216 Reputation points
1 answer

How to query a third-party Azure Data Lake Gen2 and only store the results

First, I want to query and aggregate raw JSON files stored in a third party's Azure Data Lake (Gen2) and store those aggregates in my own data lake or relational database. I do not want to physically copy all of those raw JSON files…

Azure SQL Database
Azure Data Lake Storage
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,669 questions
Azure Databricks
Azure Data Factory
asked 2020-12-15T16:14:02.687+00:00
JasonW-5564 161 Reputation points
commented 2020-12-22T12:18:09.123+00:00
HarithaMaddi-MSFT 10,136 Reputation points
1 answer

LinkedIn connectivity with Azure

Hi, is there a way to extract existing content from a LinkedIn company page using Azure? Thanks

Azure Logic Apps
An Azure service that automates the access and use of data across clouds without writing code.
2,984 questions
Azure Databricks
Azure Data Factory
Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.
2,627 questions
asked 2020-12-16T07:47:10.393+00:00
Darshika Rajendran 1 Reputation point
commented 2020-12-22T12:17:25.62+00:00
HarithaMaddi-MSFT 10,136 Reputation points
0 answers

Azure repo - Can't create or push tag

I host a project on Azure DevOps Repositories and I would like to create several Git tags for project version releases. I'm a GitKraken user, so I added a new tag and pushed it to origin, but this error occurs: So I checked my permissions on Azure…

Azure Databricks
asked 2020-12-11T13:51:52.037+00:00
MATTONE THOMAS 1 Reputation point
commented 2020-12-19T11:42:47.243+00:00
Jaliya Udagedara 2,821 Reputation points MVP
1 answer

A file updated in Databricks is not reflected in the Azure portal

I have a Databricks workspace where I read a file from Azure Blob Storage, updated the file, and uploaded it to another Azure Blob Storage location. Now when I access that file through any Databricks workspace, I can see the file and access its content.…

Azure Databricks
asked 2020-12-11T03:08:59.943+00:00
Amey Pimpley 1 Reputation point
commented 2020-12-18T05:42:41.683+00:00
PRADEEPCHEEKATLA-MSFT 85,346 Reputation points Microsoft Employee
1 answer

Error when Azure Databricks writes to Cosmos DB

I am trying to write a Spark DataFrame from Azure Databricks to a Cosmos DB database, and I get this error: Py4JJavaError: An error occurred while calling o753.save. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage…

Azure Databricks
Azure Cosmos DB
An Azure NoSQL database service for app development.
1,534 questions
asked 2020-12-05T13:57:36.433+00:00
Williams Gerard Gamboa Anchante 1 Reputation point
commented 2020-12-17T22:01:17.41+00:00
HimanshuSinha-msft 19,381 Reputation points Microsoft Employee
0 answers

Deletion of managed MLflow Artifacts

When using Workspace experiments in Azure Databricks with the default managed MLflow artifact location dbfs:/databricks/mlflow-tracking configured, we see the following message when deleting an MLflow experiment run: Deleted runs are restorable for…

Azure Databricks
asked 2020-11-20T08:50:37.143+00:00
Christoph Stumpf 1 Reputation point
commented 2020-12-17T19:35:10.29+00:00
HimanshuSinha-msft 19,381 Reputation points Microsoft Employee
1 answer

Unable to delete folder "dbfs://" in Databricks

Hi, when I run the command %fs ls '/', the results show a folder with the path "dbfs://" and the name "/". When I try to run %fs ls '//' in the notebook, I get a Java error, and I am not able to delete the folder either. Please…

Azure Databricks
asked 2020-12-17T04:16:28.657+00:00
prado 1 Reputation point
commented 2020-12-17T05:38:42.44+00:00
prado 1 Reputation point
1 answer One of the answers was accepted by the question author.

Using a streaming batch for multiple operations

I am new to Spark and Databricks and was looking for a solution where I can use a batch from an Event Hub stream to apply multiple pieces of business logic, but could not find any guidance. The stream I get from Event Hub is a CDC stream from multiple…

Azure Databricks
asked 2020-12-15T04:00:40.087+00:00
Rohit Sapru 41 Reputation points
commented 2020-12-16T05:15:45.41+00:00
PRADEEPCHEEKATLA-MSFT 85,346 Reputation points Microsoft Employee
1 answer

Databricks readStream/writeStream to Azure Synapse

I am having an issue writing a stream to Azure Synapse and get the following error. Could someone take a look and see if there are any ideas?

Azure Synapse Analytics
Azure Databricks
asked 2020-12-10T06:08:39.147+00:00
sakuraime 2,321 Reputation points
commented 2020-12-15T06:19:23.62+00:00
PRADEEPCHEEKATLA-MSFT 85,346 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

How to set security permissions on databases in Databricks through notebooks

We are stuck trying to set security permissions on databases using %sql in notebooks. First, let me explain our situation and settings. We run the following code in notebooks: %sql CREATE DATABASE X ; GRANT USAGE ON DATABASE X TO…

Azure Databricks
Microsoft Entra ID
A Microsoft Entra identity service that provides identity management and access control capabilities. Replaces Azure Active Directory.
20,532 questions
asked 2020-12-11T02:21:53.517+00:00
Asuka 21 Reputation points
accepted 2020-12-14T05:45:20.553+00:00
Asuka 21 Reputation points
1 answer One of the answers was accepted by the question author.

Databricks PySpark exception handling best practices

Hi, in my current development of PySpark notebooks on Databricks, I typically use Python-specific exception blocks to handle the different situations that may arise. I am wondering if there are any best practices, recommendations, or patterns to handle…

Azure Databricks
Azure Data Factory
asked 2020-12-08T16:10:17.32+00:00
Satya D 141 Reputation points
accepted 2020-12-11T13:50:16.2+00:00
Satya D 141 Reputation points
1 answer

Azure Databricks Cluster

Hi, I created a new cluster in Databricks (QA environment). After that, when I click on the Data tab in order to create a database, I get the error below. I have checked that the cluster is up and running.

Azure Databricks
asked 2020-11-25T07:27:39.867+00:00
Vijay Kumar 2,031 Reputation points
commented 2020-12-11T09:29:48.567+00:00
PRADEEPCHEEKATLA-MSFT 85,346 Reputation points Microsoft Employee
2 answers

Quota limit hit on tutorial notebook

I'm attempting to launch a default cluster (min 2, max 8) on the premium trial account in order to run 01-The-Databricks-Environment. I haven't been able to run any operations in the notebook. I am receiving this error: Azure error code:…

Azure Databricks
asked 2020-12-07T20:09:56.14+00:00
Zak Wear 1 Reputation point
commented 2020-12-11T04:53:19.223+00:00
PRADEEPCHEEKATLA-MSFT 85,346 Reputation points Microsoft Employee