Architectures overview
Before you start to build out the data architectures of your cloud-scale analytics framework, review the articles in the following table.
Section | Description |
---|---|
Build an Initial Strategy | How to build your data strategy and pivot to become a data-driven organization. |
Define your plan | How to develop a plan for cloud-scale analytics. |
Prepare analytics estate | Overview of preparing your cloud-scale analytics estate with key design area considerations like enterprise enrollment, networking, identity and access management, policies, business continuity and disaster recovery. |
Govern your analytics | Requirements to govern data, data catalog, lineage, master data management, data quality, data sharing agreements and metadata. |
Secure your analytics estate | How to secure your analytics estate with authentication and authorization, data privacy, and data access management. |
Organize people and teams | How to organize effective operations, roles, teams, and team functions. |
Manage your analytics estate | How to provision the platform and set up observability for a scenario. |
The physical implementation of cloud-scale analytics consists of two main architectures: the data management landing zone and the data landing zone.
Data applications are a core concept for delivering a data product and can be aligned to both lakehouse and data mesh patterns.
You can scale your cloud-scale analytics deployment by using multiple data landing zones.
Implement data mesh by using cloud-scale analytics. Although most cloud-scale analytics guidance applies, there are some differences to be aware of for data domains, self-serve data platforms, onboarding data products, governance, data marketplace, and data sharing.
The following table lists reference templates that you can deploy.
Repository | Content | Required | Deployment model |
---|---|---|---|
Data management template | Central data management services and shared data services like data catalog and self-hosted integration runtime | Yes | One per cloud-scale analytics |
Data landing zone template | Data landing zone shared services, including ingestion, management, and data storage services | Yes | One per data landing zone |
Data integration template - batch processing | Additional services necessary for batch data processing | No | One or more per data landing zone |
Data integration template - stream processing | Additional services necessary for data stream processing | No | One or more per data landing zone |
Data product template - analytics and data science | Additional services necessary for data analytics and AI | No | One or more per data landing zone |
These repositories contain Azure Resource Manager (ARM) templates, their parameter files, and CI/CD pipeline definitions for resource deployment.
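The deployment models in the preceding table can be written down as data and checked against a planned rollout. The following Python sketch is illustrative only: the `Template` class, the dictionary keys, the `validate_plan` function, and the landing zone names are hypothetical and aren't part of the reference repositories. It also assumes that a deployment needs at least one data landing zone, which the table implies but doesn't state outright.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    name: str
    required: bool
    per_landing_zone: bool  # False: one per cloud-scale analytics deployment
    allow_multiple: bool    # True: "one or more" instances are allowed


# Hypothetical keys; the values mirror the rows of the reference template table.
TEMPLATES = {
    "data_management": Template("Data management template", True, False, False),
    "data_landing_zone": Template("Data landing zone template", True, True, False),
    "integration_batch": Template("Data integration template - batch processing", False, True, True),
    "integration_stream": Template("Data integration template - stream processing", False, True, True),
    "product_analytics": Template("Data product template - analytics and data science", False, True, True),
}


def validate_plan(management_count: int, landing_zones: dict[str, dict[str, int]]) -> list[str]:
    """Return rule violations; an empty list means the plan fits the table."""
    errors = []
    if management_count != 1:
        errors.append(f"expected exactly 1 data management deployment, got {management_count}")
    if not landing_zones:
        # Assumption: a deployment has at least one data landing zone.
        errors.append("at least one data landing zone is needed")
    for zone, counts in landing_zones.items():
        # Reject keys that are unknown or that don't belong inside a landing zone.
        for key in counts:
            template = TEMPLATES.get(key)
            if template is None or not template.per_landing_zone:
                errors.append(f"{zone}: '{key}' is not a landing zone template")
        # Check the cardinality of every landing zone template.
        for key, template in TEMPLATES.items():
            if not template.per_landing_zone:
                continue
            count = counts.get(key, 0)
            if template.required and count < 1:
                errors.append(f"{zone}: missing required '{template.name}'")
            if not template.allow_multiple and count > 1:
                errors.append(f"{zone}: only one '{template.name}' is allowed, got {count}")
    return errors


if __name__ == "__main__":
    plan = {
        "lz-sales": {"data_landing_zone": 1, "integration_batch": 2},
        "lz-finance": {"data_landing_zone": 1, "product_analytics": 1},
    }
    print(validate_plan(1, plan))
```

A plan passes when there's one data management deployment, each landing zone has exactly one data landing zone template, and the optional templates appear zero or more times per zone.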
Templates can change over time due to new Azure services and requirements. Secure each repository's main branch so it remains error-free and ready for consumption and deployment. Use a development subscription to test template configuration changes before you merge feature enhancements back into your main branch.
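One lightweight check to run before merging a template change is to confirm that a parameter file still matches the template it targets. The following Python sketch assumes the standard ARM JSON layout, where a template declares entries under `parameters` (optionally with a `defaultValue`) and a parameter file supplies entries under `parameters`. The `check_parameters` function and the sample parameter names are hypothetical, and the check doesn't replace a test deployment to a development subscription.

```python
import json


def check_parameters(template: dict, parameter_file: dict) -> dict[str, list[str]]:
    """Compare the parameters a template declares with those a parameter file supplies."""
    declared = template.get("parameters", {})
    supplied = parameter_file.get("parameters", {})
    # Parameters with no default must be supplied by the parameter file.
    missing = sorted(
        name for name, spec in declared.items()
        if "defaultValue" not in spec and name not in supplied
    )
    # Parameters the template doesn't declare are likely stale or misspelled.
    unknown = sorted(name for name in supplied if name not in declared)
    return {"missing": missing, "unknown": unknown}


if __name__ == "__main__":
    # Hypothetical template and parameter file contents, inlined for illustration.
    template = json.loads("""
    {
      "parameters": {
        "location": {"type": "string", "defaultValue": "westeurope"},
        "prefix": {"type": "string"},
        "vnetId": {"type": "string"}
      },
      "resources": []
    }
    """)
    parameter_file = json.loads("""
    {
      "parameters": {
        "prefix": {"value": "dlz01"},
        "subnetId": {"value": "placeholder"}
      }
    }
    """)
    print(check_parameters(template, parameter_file))
```

A CI/CD pipeline could run a check like this on each pull request and fail the build when either list is non-empty, so that mismatches surface before the change reaches the main branch.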
The reference architecture is secure by design. It uses a multilayered security approach to overcome common data exfiltration risks.
The simplest security solution is to host a jumpbox on the virtual network of the data management landing zone or data landing zone, and then connect to the data services through private endpoints.
For a list of questions and answers about cloud-scale analytics, see Frequently asked questions.