2,045 questions with Azure Databricks tags

Sort by: Updated
0 answers

How to view a Parquet file with no data and export its headers to CSV

I have a Parquet file with no data in it. When I create a notebook and create a DataFrame, it does not show me the columns. I can see the root folder structure, though. The file has nested objects and arrays in its columns and I want to transform it. How…

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
2,045 questions
asked 2020-08-18T02:23:56.807+00:00
reddy 41 Reputation points
commented 2020-08-20T09:10:56.837+00:00
PRADEEPCHEEKATLA-MSFT 84,381 Reputation points Microsoft Employee
1 answer

Databricks notebooks drp

I would like to know what happens to my Azure Databricks notebooks in case of a region outage. For example, if my primary zone is CentralUS and it happens to be down, can I still log in to centralusdatabricks.net and see my notebooks? If not, I would…

Azure Databricks
asked 2020-08-13T12:42:34.847+00:00
Anonymous
commented 2020-08-20T09:09:29.76+00:00
PRADEEPCHEEKATLA-MSFT 84,381 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

How to transform all files in a folder and export as separate files in one notebook

I have an ADLS Gen2 folder with multiple Parquet files with the same structure. I want to transform all the files separately with one script in the same notebook, convert each file to CSV, and write it to another folder in ADLS. How can I achieve this? …

Azure Databricks
asked 2020-08-18T03:02:22.36+00:00
reddy 41 Reputation points
accepted 2020-08-19T14:36:07.86+00:00
reddy 41 Reputation points
1 answer

Is .NET for Apache Spark in Preview?

I have read many articles while exploring Azure Data Factory and Azure Databricks. I stumbled upon an article (https://video2.skills-academy.com/en-us/dotnet/spark/how-to-guides/databricks-deploy-methods) where it is mentioned in the notes that .NET for Apache…

Azure Databricks
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,015 questions
asked 2020-08-05T13:58:20.25+00:00
nikhil.sharma3 1 Reputation point
commented 2020-08-17T22:15:45.967+00:00
HimanshuSinha-msft 19,376 Reputation points Microsoft Employee
1 answer

Move Delta table data from Databricks into Azure SQL Database

Hi friends, I have one requirement: my source data is in a Delta table in Databricks. I want to move the source data into the destination (Azure SQL DB). Can you please suggest the best option for moving the data from source to destination?…

Azure Databricks
asked 2020-08-03T12:15:30.333+00:00
chandrasekhar munagala 21 Reputation points
commented 2020-08-17T22:11:50.903+00:00
HimanshuSinha-msft 19,376 Reputation points Microsoft Employee
1 answer

Recover table data in Databricks.

I accidentally deleted data from a table in prod Databricks. Is there a way to recover the data?

Azure Databricks
asked 2020-07-31T23:06:06.913+00:00
naga perni 1 Reputation point
commented 2020-08-17T22:11:10.253+00:00
HimanshuSinha-msft 19,376 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

How to perform distributed combinatorial (N choose K) in Spark .NET?

I have a project with a large number of combinations, C(100,20), with minor work being done for each combination set. I am using Spark .NET with Visual Studio as my technology (see setup below):…

Azure Databricks
asked 2020-08-13T17:20:33.797+00:00
Robert Hogue 96 Reputation points
commented 2020-08-17T03:45:35.32+00:00
PRADEEPCHEEKATLA-MSFT 84,381 Reputation points Microsoft Employee
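
One way to think about distributing an N choose K enumeration is to split the combination space into independent partitions, for example by the smallest element of each combination, and hand each partition to a separate task. The sketch below is plain Python rather than Spark .NET, and the function names `partition_sizes` and `combinations_with_min` are hypothetical, chosen only for illustration. It shows the partitioning idea, not a tuned job: C(100,20) is on the order of 10^20 combinations, so a full enumeration is not practical on any cluster, and the larger partitions would need further splitting.

```python
from itertools import combinations
from math import comb


def partition_sizes(n, k):
    """Count the k-subsets of range(n) grouped by their smallest element.

    Entry i is the number of combinations whose smallest element is i.
    The entries sum to comb(n, k), so the partitions cover every
    combination exactly once.
    """
    return [comb(n - i - 1, k - 1) for i in range(n - k + 1)]


def combinations_with_min(n, k, first):
    """Yield every k-subset of range(n) whose smallest element is `first`.

    Each value of `first` defines an independent unit of work that a
    separate worker could process without coordinating with the others.
    """
    for rest in combinations(range(first + 1, n), k - 1):
        yield (first,) + rest


if __name__ == "__main__":
    # Small case so the output is easy to check by hand.
    print(partition_sizes(5, 3))
    print(list(combinations_with_min(5, 3, 1)))
```

The partition sizes are very uneven (the first partition is the largest), so a real job would likely split the big partitions again on the second-smallest element to balance the load.
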
2 answers

Databricks monitoring using Azure Monitor

Hi Team, I want to monitor Azure Databricks metrics and other info like quota, cluster capacity, and number of nodes, and I want to put all this information on an Azure dashboard. How can I send the Databricks logs to Azure Monitor without Grafana? Thanks &…

Azure Monitor
An Azure service that is used to collect, analyze, and act on telemetry data from Azure and on-premises environments.
2,969 questions
Azure Databricks
asked 2020-08-04T09:24:26.763+00:00
Rohit 61 Reputation points
commented 2020-08-14T08:57:47.803+00:00
PRADEEPCHEEKATLA-MSFT 84,381 Reputation points Microsoft Employee
2 answers One of the answers was accepted by the question author.

Transform table results to JSON in Azure Databricks

Hi, I am working on a data transformation of SQL table results to a JSON string, to be saved as JSON documents. I am stuck on how to proceed from here. I can query sale but am not able to create a JSON string of the table data and eventually save as a…

Azure Databricks
asked 2020-08-07T18:46:26.077+00:00
Raj D 586 Reputation points
commented 2020-08-12T18:01:08.62+00:00
Raj D 586 Reputation points
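
The core step in turning table results into JSON is pairing each row's values with the column names and serializing the result. The sketch below shows that step in plain Python with the standard `json` module. It assumes the query results have already been collected as a list of column names plus a list of row tuples. The function name `rows_to_json_documents` and the sample columns are hypothetical and do not come from the question.

```python
import json


def rows_to_json_documents(columns, rows):
    """Serialize each row as one JSON document (one string per row).

    `columns` is a list of column names and `rows` is an iterable of
    tuples in the same column order. `default=str` is a fallback for
    values the json module cannot encode directly, such as dates.
    """
    return [
        json.dumps(dict(zip(columns, row)), sort_keys=True, default=str)
        for row in rows
    ]


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    docs = rows_to_json_documents(["id", "amount"], [(1, 9.5), (2, 12.0)])
    for doc in docs:
        print(doc)
```

Collecting rows to the driver like this only suits small result sets. For larger tables, serializing on the cluster would be the more likely approach.
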
2 answers One of the answers was accepted by the question author.

Import JSON payload from a REST API and save as JSON documents in ADLS Gen2

Hi, I am trying to import a JSON payload from a REST API GET method and save JSON documents into ADLS Gen2 using Azure Databricks. GET: https://myapi.com/api/v1/city GET method output: [ {"id":2643743, …

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
1,409 questions
Azure Databricks
Windows Server PowerShell
Windows Server: A family of Microsoft server operating systems that support enterprise-level management, data storage, applications, and communications. PowerShell: A family of Microsoft task automation and configuration management frameworks consisting of a command-line shell and associated scripting language.
5,446 questions
asked 2020-08-03T17:53:23.913+00:00
Raj D 586 Reputation points
commented 2020-08-11T22:56:26.15+00:00
Raj D 586 Reputation points
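
Once the response body is in hand, splitting a JSON array payload into one document per record can be done with the standard library alone. The sketch below assumes the payload is a JSON array of objects that each carry an `id` field, as the excerpt suggests. The HTTP call and the ADLS Gen2 mount are left out. The function name `save_records`, the `name` field, and the output directory are hypothetical. On a cluster, `out_dir` would be whatever local path the storage is reachable under.

```python
import json
from pathlib import Path


def save_records(payload_text, out_dir, key="id"):
    """Write each object in a JSON array to its own <key>.json file.

    Returns the list of written paths in payload order. Records are
    assumed to contain `key`; a missing key raises KeyError.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for record in json.loads(payload_text):
        path = out / f"{record[key]}.json"
        path.write_text(json.dumps(record), encoding="utf-8")
        written.append(path)
    return written


if __name__ == "__main__":
    import tempfile

    # Hypothetical payload: only the "id" value appears in the question.
    sample = '[{"id": 2643743, "name": "example"}]'
    with tempfile.TemporaryDirectory() as tmp:
        for p in save_records(sample, tmp):
            print(p.name)
```
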
1 answer

Databricks 7.0 load to Azure Synapse Analytics fails when using useAzureMSI = true and writeSemantics = copy

When I try to execute a script on Databricks 7.0 to write data to a table in Azure Synapse Analytics, I get an error: Parse error at line: 7, column: 30: Incorrect syntax near ''Managed Service Identity''. I have the useAzureMSI option set to true. …

Azure Databricks
asked 2020-06-29T21:51:55.107+00:00
Tom Smith 1 Reputation point
answered 2020-08-11T15:21:53.627+00:00
Nolan Walker 1 Reputation point
2 answers

How to run .NET Spark jobs on Databricks from Azure Data Factory?

In Azure Data Factory, you have a Databricks Activity. This activity supports running Python, JAR, and notebooks, and these notebooks may be written in Scala, Python, Java, and R, but not C#/.NET. Is there inherent or direct support where I can write my…

Azure Databricks
Azure Data Factory
asked 2020-08-05T06:43:23.207+00:00
nikhil.sharma3 1 Reputation point
answered 2020-08-10T15:53:06.977+00:00
HimanshuSinha-msft 19,376 Reputation points Microsoft Employee
1 answer

FileNotFoundException when using abfss to list files in Azure Databricks!

Hi team, I am trying to connect to ADLS2 using Hadoop configurations: But when I try to use FS commands to list all the files on the path, I get a file not found exception: import org.apache.hadoop.fs.{FileSystem, Path} …

Azure Data Lake Storage
Azure Databricks
asked 2020-08-04T18:22:18.177+00:00
Goel, Akanksha 66 Reputation points
commented 2020-08-06T06:44:29.74+00:00
PRADEEPCHEEKATLA-MSFT 84,381 Reputation points Microsoft Employee
1 answer

How to pass a column list as an argument from Databricks Spark for copy write semantics

Is there a way to pass a column list argument for column mapping between a Spark table and a Synapse table from Databricks Spark when the write semantics is copy, as we do when running the COPY command from Synapse?

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,621 questions
Azure Databricks
asked 2020-07-23T06:54:16.693+00:00
Rishabh 11 Reputation points
commented 2020-08-05T00:32:38.243+00:00
HimanshuSinha-msft 19,376 Reputation points Microsoft Employee
1 answer

Using Spark action in HDInsight Hue

I have created a Spark 2.4 cluster using HDInsight in Azure. I have installed Hue over it using Script actions. I also did the necessary steps for SSH tunneling and connecting to the Hue UI. However, on the Hue UI I am only able to see Pig and Hive…

Azure HDInsight
An Azure managed cluster service for open-source analytics.
204 questions
Azure Databricks
asked 2020-07-23T09:32:26.923+00:00
Abhijeet Bane 1 Reputation point
commented 2020-08-04T11:17:42.693+00:00
PRADEEPCHEEKATLA-MSFT 84,381 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

Installing gsl on a Databricks cluster

Hello, I am trying to install the GNU Scientific Library on a Databricks high-concurrency cluster. When I run the following shell script: %sh sudo apt-get install libgsl-dev The script keeps running forever, even though the file is about 8Mb…

Azure Databricks
asked 2020-07-28T06:35:46.18+00:00
Arko Bose 21 Reputation points
commented 2020-07-29T12:41:07.213+00:00
PRADEEPCHEEKATLA-MSFT 84,381 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

Can you help me with listing all the files under mounted blob store?

I have mounted the blob store but I am getting the following error. Please find the attached screenshot.

Azure Storage Accounts
Globally unique resources that provide access to data management services and serve as the parent namespace for the services.
2,871 questions
Azure Databricks
asked 2020-07-23T09:42:45.263+00:00
Goel, Akanksha 66 Reputation points
commented 2020-07-29T09:36:42.967+00:00
Goel, Akanksha 66 Reputation points
0 answers

DDL operations against Azure Cosmos DB Cassandra API from Spark

Hi, I'm seeing very strange behavior on DDL operations against Cosmos DB with the Cassandra API from Spark running in Databricks. When creating a keyspace and table as follows, it does execute the create statements with no errors, BUT it really…

Azure Databricks
Azure Cosmos DB
An Azure NoSQL database service for app development.
1,518 questions
asked 2020-07-07T22:29:17.973+00:00
Sungho Hong 1 Reputation point
commented 2020-07-28T05:37:14.93+00:00
Anurag Sharma 17,586 Reputation points
0 answers

ADF Pipeline error

I have had a pipeline running for the past month on a daily basis. Today the pipeline failed saying that the notebook does not exist. I am able to access the said notebook, it is in the location the pipeline has defined for it, and I can…

Azure Databricks
Azure Data Factory
asked 2020-07-20T22:28:50.037+00:00
Brittany 1 Reputation point
commented 2020-07-27T08:00:11.84+00:00
HarithaMaddi-MSFT 10,136 Reputation points
1 answer

How do we add a user to a Synapse Analytics workspace?

Hi, how do I add an AAD user to a specific Synapse Analytics workspace? At the moment it is showing only SQL Pool Admin, Spark Admin and Workspace Admin. I just want to add an end user to a workspace so that they can work with it. In Azure Databricks we…

Azure Synapse Analytics
Azure Databricks
asked 2020-07-13T14:14:27.267+00:00
Anbu.Dhanushkodi 1 Reputation point
commented 2020-07-23T07:49:05.83+00:00
Rishabh 11 Reputation points