ADF Copy activity using PolyBase fails to load "Date" type to Synapse

Robert Kostecki 6 Reputation points
2020-09-24T14:46:46.05+00:00

When loading Date type from a parquet file to Synapse with "Date" type via ADF Copy with PolyBase, the ADF throws the following error:

{
"errorCode": "2200",
"message": "ErrorCode=FailedDbOperation,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Please make sure SQL DW has access to ADLS Gen2 account,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Data.SqlClient.SqlException,Message=HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: ClassCastException: ,Source=.Net SqlClient Data Provider,SqlErrorNumber=106000,Class=16,ErrorCode=-2146232060,State=1,Errors=[{Class=16,Number=106000,State=1,Message=HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: ClassCastException: ,},],'",
"failureType": "UserError",
"target": "DeltaCopyToStaging",
"details": []
}

To replicate:

  1. Execute the following script in Synapse: CREATE TABLE dbo.[Date_Issue_PoC]
    (
    [ContactExpiryDate] [date] NULL
    )
    WITH
    (
    DISTRIBUTION = ROUND_ROBIN,
    HEAP
    ) INSERT INTO dbo.Date_Issue_PoC (ContactExpiryDate) VALUES ('2013-04-19')
  2. Export data from the table to parquet file in ADLS Gen2 using ADF Copy Activity (Synapse => Parquet file).
  3. Import data from parquet file generated in step to back to the table using ADF Copy Activity and setting the PolyBase method.

Can you provide a resolution or a work-around please?

63119

Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,631 questions
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,029 questions
0 comments No comments
{count} votes

2 answers

Sort by: Most helpful
  1. HimanshuSinha-msft 19,376 Reputation points Microsoft Employee
    2020-09-24T22:49:14.997+00:00

    Hello ,
    Thanks for posting the question and welcome to Microsoft Q & A .

    I was able to repro the issue and I am trying to get some help on that . One thing which I found out if you update the
    datatype from date to datetime the ADF pipeline works fine .
    Update

    CREATE TABLE dbo.[Date_Issue_PoC]
    (
    [ContactExpiryDate] [date] NULL
    )
    WITH
    (
    DISTRIBUTION = ROUND_ROBIN,
    HEAP
    )

    to

    CREATE TABLE dbo.[Date_Issue_PoC]
    (
    [ContactExpiryDate] [datetime] NULL
    )
    WITH
    (
    DISTRIBUTION = ROUND_ROBIN,
    HEAP
    )

    Thanks Himanshu

    Please do consider to click on "Accept Answer" and "Up-vote" on the post that helps you, as it can be beneficial to other community members

    0 comments No comments

  2. Robert Kostecki 6 Reputation points
    2020-09-25T17:15:07.9+00:00

    Hi Himanshu,

    Thank you for looking into this - unfortunately I cannot confirm that your workaround works. I have changed the type to datetime and it still fails to load the parquet "date" type (please see the screenshot below).

    Am I correct to assume that after changing the type to datetime you have regenerated the file thus overriding the original parquet "date" with "datetime"? If so then I am afraid it does not answer the question how to load parquet file with "date" type.

    NB In real life scenario the parquet file comes from an (external) source system over which we have no control (i.e. we cannot simply go to the source and amend the type).

    I will be looking forward to your solution and in the meantime I attach results from my test of the suggested workaround.

    28461-loading-date-type.png

    0 comments No comments