Write JSON document to Azure table

Raj D 586 Reputation points
2020-08-21T05:07:09.597+00:00

Hi, I am using the PySpark code below to write JSON documents from an Azure Data Lake Gen2 container into a SQL Server table.

Code:

    # Storage account, container, and access key for the blob staging area
    blobStorage = "<blob-storage-account-name>.blob.core.windows.net"
    blobContainer = "<blob-container-name>"
    blobAccessKey = "<access-key>"
    empDir = "wasbs://" + blobContainer + "@" + blobStorage + "/empDir"

    # Register the storage key with Spark before reading from the container
    acntInfo = "fs.azure.account.key." + blobStorage
    spark.conf.set(acntInfo, blobAccessKey)

    # Read the JSON documents into a DataFrame
    df = spark.read.format("json").load(empDir + "/data")

    # SQL Server connection settings
    dwDatabase = "<database-name>"
    dwServer = "<database-server-name>"
    dwUser = "<user-name>"
    dwPass = "<password>"
    dwJdbcPort = "1433"
    dwJdbcExtraOptions = "encrypt=true;trustServerCertificate=true;hostNameInCertificate=*.database.windows.net;loginTimeout=30;"
    sqlDwUrl = ("jdbc:sqlserver://" + dwServer + ":" + dwJdbcPort + ";database=" + dwDatabase
                + ";user=" + dwUser + ";password=" + dwPass + ";" + dwJdbcExtraOptions)
    sqlDwUrlSmall = ("jdbc:sqlserver://" + dwServer + ":" + dwJdbcPort + ";database=" + dwDatabase
                     + ";user=" + dwUser + ";password=" + dwPass)

    # Write through the SQL DW connector, staging the data as Parquet in empDir
    spark.conf.set("spark.sql.parquet.writeLegacyFormat", "true")
    df.write.format("com.databricks.spark.sqldw") \
        .option("url", sqlDwUrlSmall) \
        .option("dbtable", "Employee") \
        .option("forward_spark_azure_storage_credentials", "True") \
        .option("tempdir", empDir) \
        .mode("overwrite") \
        .save()

Error Message:

    com.databricks.spark.sqldw.SqlDWConnectorException: Exception encountered in SQL DW connector code.
    ---------------------------------------------------------------------------
    Py4JJavaError                             Traceback (most recent call last)
    <command-1454693370047038> in <module>
          1 spark.conf.set("spark.sql.parquet.writeLegacyFormat", "true")
    ----> 2 df.write.format("com.databricks.spark.sqldw").option("url", "sqlDwUrlSmall").option("dbtable", "SampleTable").option("forward_spark_azure_storage_credentials","True").option("tempdir", dir).mode("overwrite").save()

Thank you


1 answer

  1. Raj D 586 Reputation points
    2020-08-21T11:41:04.557+00:00

    Hi PRADEEPCHEEKATLA-MSFT, thanks very much for your response. Please find the complete stack trace below.

    com.databricks.spark.sqldw.SqlDWConnectorException: Exception encountered in SQL DW connector code.

    Py4JJavaError                             Traceback (most recent call last)
    <command-1454693370047038> in <module>
          1 spark.conf.set("spark.sql.parquet.writeLegacyFormat", "true")
    ----> 2 df.write.format("com.databricks.spark.sqldw").option("url", sqlDwUrlSmall).option("dbtable", "Employee").option( "forward_spark_azure_storage_credentials","True").option("tempdir", empDir).mode("overwrite").save()
    /databricks/spark/python/pyspark/sql/readwriter.py in save(self, path, format, mode, partitionBy, **options)
        735             self.format(format)
        736         if path is None:
    --> 737             self._jwrite.save()
        738         else:
        739             self._jwrite.save(path)
    /databricks/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py in __call__(self, *args)
       1255         answer = self.gateway_client.send_command(command)
       1256         return_value = get_return_value(
    -> 1257             answer, self.gateway_client, self.target_id, self.name)
       1258
       1259         for temp_arg in temp_args: