Synapse spark-avro module is external and not included in spark-submit

Robson Soares 1 Reputation point
2020-08-19T13:34:26.577+00:00

Hi.

I have been processing Avro files from Data Lake Storage Gen2 using Databricks, and it worked fine.

Now I am testing the same scenario using a Spark pool in Synapse Analytics, but the spark-avro module is external and not included in spark-submit.
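
For context, the read looks roughly like this (the storage account, container, and path below are placeholders):

    # Read Avro files from ADLS Gen2 with the spark-avro data source
    df = spark.read.format("avro").load(
        "abfss://<container>@<storage-account>.dfs.core.windows.net/path/to/files/*.avro"
    )
    df.show(5)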

How can I install those packages and the respective dependencies on Synapse Spark cluster?

Best regards

Robson

Azure Synapse Analytics

1 answer

  1. PRADEEPCHEEKATLA-MSFT 84,531 Reputation points Microsoft Employee
    2020-08-20T09:56:36.383+00:00

    Hello @Robson Soares,

    Welcome to the Microsoft Q&A platform.

    Libraries provide reusable code that you may want to include in your programs or projects. To make third party or locally-built code available to your applications, you can install a library onto one of your Spark Pools (preview). Once a library is installed for a Spark pool, it is available for all sessions using the same pool.

    The documentation article "Manage libraries for Apache Spark in Azure Synapse Analytics" describes the different ways to manage libraries for a Spark pool, including workspace packages and pool-level packages.
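
    For the spark-avro case specifically, a session-scoped option is to request the package when the session starts. The cell below is a minimal sketch, assuming the pool can reach Maven Central and that the coordinate matches your pool's Spark and Scala version (for example, spark-avro_2.11:2.4.4 for a Spark 2.4 pool); if outbound downloads are blocked, upload the spark-avro jar as a workspace package instead, as described in that article.

        %%configure -f
        {
            "conf": {
                "spark.jars.packages": "org.apache.spark:spark-avro_2.11:2.4.4"
            }
        }

    Once the session starts with this configuration, spark.read.format("avro") should resolve instead of failing with the "module is external" message.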

    Hope this helps. Do let us know if you have any further queries.

    ----------------------------------------------------------------------------------------

    Do click on "Accept Answer" and Upvote on the post that helps you, this can be beneficial to other community members.