Lambda Architecture implementation using Microsoft Azure

This TechNet Wiki post provides an overview of how Lambda Architecture can be implemented using Microsoft Azure platform capabilities. It is subject to further community refinements and updates as new features and capabilities become available in Microsoft Azure.

 


Introduction

Gone are the days when enterprises would wait hours or days to view dashboards built on old, stale data. In today's fast-moving world of BYOD, fitness gear and a flood of other devices, it is becoming critical to derive “actionable” information from the huge volume of data and noise generated by these devices and other data sources, and to act on it proactively in real time, in order to stay competitive.

At the same time, the need for dashboards and other analytical capabilities based on high-quality, cleansed, processed data still very much exists.

With the emergence of more data types and the need to handle huge volumes, a shift is under way from conventional data warehouse practice to cloud-based data processing and management capabilities, where high-volume batch processing is possible at an optimized cost. Business scenarios that also require data to be processed in real time for subsequent action add complexity.

Among the available architectural patterns for data processing and management, Lambda Architecture stands out: it aims to address business scenarios that require processing huge volumes of data both in batch and in real time.

 


Lambda Architecture – Snapshot

The objective of Lambda Architecture is to combine the power of batch and real-time processing to address business scenarios that require both a historical view of the data and real-time insight into the data as business happens.

 


Lambda Architecture - logical layers

The logical layers of the Lambda Architecture include:

Batch Layer

The batch layer precomputes results using a distributed processing system that can handle very large quantities of data. The batch layer aims at perfect accuracy by being able to process all available data when generating views.
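As a minimal sketch of the batch-layer idea, the Python snippet below recomputes a view from scratch over a complete, append-only master dataset. The event fields (`device_id`, `reading`) and the function name `compute_batch_view` are hypothetical and chosen only for illustration; in a real deployment this work would run on a distributed processing system rather than in a single process.

```python
from collections import defaultdict


def compute_batch_view(master_dataset):
    """Recompute per-device count and average reading from ALL events."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for event in master_dataset:
        agg = totals[event["device_id"]]
        agg["count"] += 1
        agg["sum"] += event["reading"]
    # The batch view is a plain lookup table keyed by device.
    return {
        device: {"count": agg["count"], "avg": agg["sum"] / agg["count"]}
        for device, agg in totals.items()
    }


# Hypothetical master dataset: events are only ever appended, never updated.
master_dataset = [
    {"device_id": "d1", "reading": 20.0},
    {"device_id": "d1", "reading": 22.0},
    {"device_id": "d2", "reading": 30.0},
]
batch_view = compute_batch_view(master_dataset)
```

Because the view is rebuilt from every available event on each run, it is accurate but only as fresh as the last batch run.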

 

Speed Layer

The speed layer processes data streams in real time and without the requirements of fix-ups or completeness. This layer sacrifices throughput as it aims to minimize latency by providing real-time views into the most recent data. Essentially, the speed layer is responsible for filling the "gap" caused by the batch layer's lag in providing views based on the most recent data. 
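The following Python sketch illustrates one way to think about the speed layer: a small view that is updated incrementally as each event arrives and is discarded once the batch layer has caught up. The class name `RealTimeView`, its methods and the `device_id` field are assumptions made for illustration, not part of any Azure API.

```python
class RealTimeView:
    """Incrementally maintained per-device event counts for recent data only."""

    def __init__(self):
        self.counts = {}

    def update(self, event):
        # Constant-time update per event: no reprocessing of history.
        device = event["device_id"]
        self.counts[device] = self.counts.get(device, 0) + 1

    def reset(self):
        # Called once the batch layer has absorbed these events,
        # so the speed layer only ever covers the "gap".
        self.counts.clear()


view = RealTimeView()
for event in [{"device_id": "d1"}, {"device_id": "d2"}, {"device_id": "d1"}]:
    view.update(event)
```

The design choice here is to keep only a small amount of state and update it per event, which keeps latency low at the cost of completeness.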

 

Serving Layer

Output from the batch and speed layers is stored in the serving layer, which responds to ad-hoc queries by returning precomputed views or building views from the processed data.
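A serving-layer query can be sketched as a merge of the two kinds of view. In the Python example below, the function `query_count` and the sample dictionaries are hypothetical; the sketch assumes both views hold per-device event counts and that the real-time view covers only events not yet included in the batch view.

```python
def query_count(device_id, batch_view, realtime_view):
    """Answer a query by combining the batch view with the real-time view."""
    return batch_view.get(device_id, 0) + realtime_view.get(device_id, 0)


# Hypothetical precomputed batch view and the more recent real-time view.
batch_view = {"d1": 100, "d2": 40}
realtime_view = {"d1": 3, "d3": 1}

total_d1 = query_count("d1", batch_view, realtime_view)
```

Counts merge by simple addition; other aggregates (averages, distinct counts) would need a merge rule suited to the aggregate.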

 


Business Benefits

Lambda Architecture is envisioned to provide the following business benefits:

  • Business Agility – react in real time to changing business / market scenarios
  • Predictability – predict patterns ranging from human behavior to machine / device lifetimes, make proactive, informed decisions, and ensure a high level of service uptime and hence goodwill

 


Candidate business scenarios

When you come across scenarios similar to the ones listed below, Lambda Architecture can be considered:

  • Tracking GPS-enabled devices and sending notifications to / triggering actions on the device based on its location in real time – say, when a car/cab fitted with a GPS device moves outside the local city boundaries, informing the customer of a possible rate change. Later, analyzing how many of your cabs go beyond the local city boundaries and which places cabs are hired to on weekdays, weekends, etc.
  • Continuously monitoring sensors in a production line for changes in key parameters such as temperature and triggering the sensors to stop in real time when thresholds are breached; performing predictive modeling of which categories of sensors, by model / make, are likely to breach thresholds frequently, leveraging previously archived historical data
  • Dynamically changing the price of items sold through vending machines deployed across multiple remote locations, based on available stock, predicted sales, seasons, location-specific events, etc.
  • Continuously monitoring drivers’ driving patterns and advising them accordingly on their probability of having an accident and the anticipated wear and tear on their car’s accessories / parts.
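To make the two paths concrete, here is a minimal Python sketch of the production-line sensor scenario from the list above. The threshold value, the field names (`sensor_id`, `model`, `temperature_c`) and both function names are hypothetical, and the historical analysis is a plain breach count per model standing in for real predictive modeling.

```python
from collections import Counter

TEMP_THRESHOLD_C = 80.0  # hypothetical threshold, for illustration only


def check_reading(reading, threshold=TEMP_THRESHOLD_C):
    """Real-time path: decide immediately whether to stop the sensor."""
    return "STOP" if reading["temperature_c"] > threshold else "OK"


def breaches_by_model(archive, threshold=TEMP_THRESHOLD_C):
    """Batch path: count historical threshold breaches per sensor model."""
    return Counter(r["model"] for r in archive if r["temperature_c"] > threshold)


# Hypothetical archived readings; the same records also drive the real-time check.
archive = [
    {"sensor_id": "s1", "model": "A", "temperature_c": 85.0},
    {"sensor_id": "s2", "model": "B", "temperature_c": 70.0},
    {"sensor_id": "s3", "model": "A", "temperature_c": 91.5},
]
actions = [check_reading(r) for r in archive]
breach_counts = breaches_by_model(archive)
```

The same stream of readings feeds both functions: the first acts on each reading as it arrives, while the second runs later over the archived history.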

 


Capability enablers from Microsoft Azure

When it comes to realizing Lambda Architecture on a public cloud, Microsoft Azure provides various capabilities that can be leveraged together for the implementation.

 

The picture above provides a high-level mapping between some of the Azure capabilities and the various layers of Lambda Architecture.

The table below maps the logical layers of Lambda Architecture to Azure capabilities:

Layer | Description | Azure Capabilities
Batch Layer | Stores the master dataset; high latency; horizontally scalable. Data is appended and stored (batch view). | Azure HDInsight, Azure Blob storage
Speed Layer | Stream processing of data; stores limited data; dynamic computation. Data is processed in real time and stored for both read and write operations (real-time view). | Azure Stream Analytics, Azure HDInsight Spark
Serving Layer | Queries batch and real-time views and merges results. Indexes batch views / out-of-date results. | Power BI

Advantages of leveraging Azure for Lambda Architecture

  • Security – no compromise on data security; provides security for data both at rest and in flight
  • Flexibility – you have the flexibility to use open source capabilities such as Spark, Hive, Sqoop, etc. on Azure and continue leveraging your hard-earned skills
  • Supportability – supports devices based on heterogeneous platforms, and various protocols and industry standards
  • Adaptability – when a particular protocol is not supported, you are free to use a protocol adapter to standardize on a particular protocol
  • Optimized cost – pay as you go
  • Reliability – Azure SLAs are defined at the service level and are transparent
  • Supports all IoT communication patterns – Telemetry, Inquiries, Commands and Notifications