Error when triggering a Databricks notebook via Data Factory while reading an XML file

Question

I am reading an XML file in Spark on Databricks, and below is the command I am using to read the file.

val readxml = spark.read.format("xml").option("rowTag", "rty").option("inferSchema", "true").load("filepath/" + ip)

This works fine when I run the code in the Databricks notebook, but when I try to run the same code via an ADF pipeline, I get the error below.

java.lang.ClassNotFoundException: Failed to find data source: xml. Please find packages at http://spark.apache.org/third-party-projects.html

I have already attached and installed the XML library on the cluster, and I have also ensured that the Scala version is the same in the Databricks linked service and on the cluster.

Could someone please assist?
Thank you.

Answer

Hi Himanshu,

I was able to resolve this issue by adding the missing XML library to the Notebook activity in the ADF pipeline (under "Append libraries" in the activity settings).
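
In case it helps anyone debugging the same error, here is a minimal sketch of a check that could be placed at the top of the notebook to confirm whether the XML data source is visible to the run that ADF triggers. It assumes the library in question is the spark-xml package, whose data source class is com.databricks.spark.xml.DefaultSource; classAvailable and requireXmlSource are hypothetical helper names of my own, not part of Spark or the library.

```scala
import scala.util.Try

// Hypothetical helper (not part of Spark or spark-xml): returns true if the
// class can be found by the current thread's context class loader.
def classAvailable(className: String): Boolean =
  Try(Class.forName(className, false, Thread.currentThread().getContextClassLoader)).isSuccess

// Assumption: the "xml" format is provided by the spark-xml package, whose
// data source class is com.databricks.spark.xml.DefaultSource.
val xmlSourceClass = "com.databricks.spark.xml.DefaultSource"

// Hypothetical guard: fail early with a clearer message than the
// ClassNotFoundException raised by spark.read.format("xml").
def requireXmlSource(): Unit =
  require(
    classAvailable(xmlSourceClass),
    s"$xmlSourceClass is not on the classpath; add the XML library to the Notebook activity (Append libraries)."
  )
```

Calling requireXmlSource() just before the spark.read.format("xml") line should make it obvious whether the run started from ADF can see the library, independent of the file path or the rowTag option.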

Thank you
