Error "Cannot read properties of undefined (reading '0')"

Chuck Roberts 105 Reputation points
2024-07-26T10:25:18.1033333+00:00

I'm using Azure Data Factory online via a browser. I'm fairly new to ADF and have yet to complete any ADF tutorial due to various problems I've encountered.

I'm doing a tutorial on ADF from Udemy. I'm running a pipeline to copy data from a "landing" container to a "raw" container. This pipeline works fine. I have started a Dataflow debug session, as I have several data flows to run for different stages in the pipeline. All the files are in the "raw" container, and none of them are zero-length. The files are of various types, such as .parquet and .txt (the .txt files are CSV).

The dataflow I have to run is "df_raw_to_cleansed" which cleans up the data a bit. In this dataflow, when I go to the source of the orders.parquet file (the very first step in this data flow), and do Data Preview I get this error:

Cannot read properties of undefined (reading '0')

Since I'm still fairly new, and this pipeline ran before, I don't know what could be wrong here. Using AI assistance on this forum, I found that the files may be missing (they are there) or corrupted. I can check for corrupted .txt files by looking inside them. But how do I check a .parquet file for corruption?

Azure Data Factory

1 answer

  1. Amira Bedhiafi 24,181 Reputation points
    2024-07-26T20:35:25.5333333+00:00

    Based on this old thread:

    I suspect the issue is not with the pipeline parameters or variables. Do you have rule-based mapping in any of your data flow activities? In rule-based mapping, you can match columns by "name", "type", "stream", "origin", "position", etc. Please check the expression (if you have one); it is possible that it caused this error.

    Also, to check a .parquet file for corruption, you can use a tool such as the Apache Parquet CLI, or read the file in a Python script with a library such as pyarrow or pandas:

    
    import pyarrow.parquet as pq

    def check_parquet(file_path):
        """Return True if pyarrow can read the whole file, False otherwise."""
        try:
            # Reading the full table forces every row group to be decoded
            table = pq.read_table(file_path)
            print(table)
            return True
        except Exception as e:
            print(f"Error reading parquet file: {e}")
            return False

    # Example usage
    file_path = 'path/to/your/file.parquet'
    is_valid = check_parquet(file_path)
    print(f"File valid: {is_valid}")
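If pyarrow is not installed, a rougher first check is possible with the standard library alone. An unencrypted Parquet file starts and ends with the 4-byte marker `PAR1`, so a file that lacks it at either end is most likely truncated or not Parquet at all. The sketch below assumes the file has been downloaded from the "raw" container to a local path; the function name `has_parquet_magic` is hypothetical. Passing this check does not prove the file is fully readable, it only rules out the most obvious damage.

```python
import os

# 4-byte marker found at both ends of an unencrypted Parquet file
PARQUET_MAGIC = b"PAR1"

def has_parquet_magic(file_path):
    """Return True if the file starts and ends with the Parquet marker."""
    # Smallest plausible file: header marker + 4-byte footer length + footer marker
    if os.path.getsize(file_path) < 12:
        return False
    with open(file_path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)  # jump to the last 4 bytes
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

A file that fails this check is worth re-copying from the "landing" container before looking any further at the data flow itself.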
    
