Introduction
A data pipeline is a series of data processing steps: raw data is ingested from one or more sources, transformed, and finally stored for analysis.
Delta Live Tables (DLT) is a framework that simplifies building and managing data pipelines for big data and machine learning applications. You declare the transformations to perform on your data in SQL or Python, and DLT manages task orchestration, monitoring, data quality, and error handling.
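To make this concrete, here is a minimal sketch of a two-step pipeline using the DLT Python API. The source path, column name, and table names (raw_events, cleaned_events) are placeholders for illustration, and spark refers to the session that DLT provides to pipeline notebooks.

```python
import dlt
from pyspark.sql.functions import col

@dlt.table(comment="Raw events ingested from cloud storage.")
def raw_events():
    # Auto Loader incrementally picks up new files as they arrive;
    # the path below is a placeholder for your own landing zone.
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/mnt/landing/events/")
    )

@dlt.table(comment="Cleaned events ready for analysis.")
def cleaned_events():
    # DLT infers the dependency on raw_events from this read,
    # so it runs the two steps in the correct order.
    return dlt.read_stream("raw_events").where(col("event_type").isNotNull())
```

Because dependencies are inferred from the reads, you only write the transformations; the orchestration of the steps is handled by DLT.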
DLT includes several features that streamline data engineering tasks and improve the reliability of your data infrastructure. You can manage data quality with DLT expectations directly in your pipelines: expectations are declarations on a dataset that apply data quality checks to each record passing through a query. DLT also provides lineage tracking and performance optimizations.
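As an illustration of expectations, the sketch below (continuing the hypothetical pipeline above) attaches three checks to a table; the constraint names and columns are assumptions, and each decorator takes a different action when a record fails its check.

```python
import dlt
from pyspark.sql.functions import col

@dlt.table(comment="Events that pass basic quality checks.")
@dlt.expect("valid_timestamp", "event_time IS NOT NULL")   # record violations, keep the rows
@dlt.expect_or_drop("valid_user", "user_id IS NOT NULL")   # drop rows that fail the check
@dlt.expect_or_fail("positive_amount", "amount >= 0")      # stop the pipeline update on failure
def quality_checked_events():
    return dlt.read("cleaned_events")
```

Violation counts for each expectation are surfaced in the pipeline's monitoring metrics, so you can track data quality over time without writing separate validation jobs.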