Which is the best option for data warehouse / Power BI solution?

System2000 236 Reputation points
2021-10-19T10:02:24.153+00:00

Hello,

I am building a data warehouse in Azure from several external data sources. Data can be imported via pipelines in Data Factory (or Azure Synapse). Data is to be stored in Azure. Reporting through Power BI is required.

There seem to be a multitude of methods to achieve this. I could store the data in an Azure SQL database, or use blob storage. I could use an Azure Data Lake. I could use Azure Analysis Service or Azure Synapse Analytics to integrate with Power BI.

For example, as I understand it, I could do something this:
External DBs-->Azure Data Factory-->Azure Blob Storage/Azure Data Lake-->Azure Synapse Analytics-->Power BI Reports

But it might be better (or worse) to do this:
External DBs-->Azure Data Factory-->Azure SQL Database-->Azure Analytics Service-->Power BI Reports

I am looking for advice on why I would use one particular approach rather than the other.

Thanks.

Azure Analysis Services
Azure Analysis Services
An Azure service that provides an enterprise-grade analytics engine.
443 questions
Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,630 questions
{count} votes

Accepted answer
  1. System2000 236 Reputation points
    2021-10-28T20:31:34.033+00:00

    I'm accepting my own comment as answer to close this question.


5 additional answers

Sort by: Most helpful
  1. HimanshuSinha-msft 19,376 Reputation points Microsoft Employee
    2021-10-19T21:36:07.29+00:00

    Hello @System2000
    Thanks for the ask and using Microsoft Q&A platform .
    I am in agreement with you as there are lot of option and you may choose one over the other . I am not sure about which third party External DBs you are fering to here . PowerBI by iteslf supports lot of external datasource and I think you should check that out and if the you Ext DB are supported may be you can achieved lot by using PowerBI itself .
    https://video2.skills-academy.com/en-us/power-bi/connect-data/desktop-data-sources.
    At this time PowerBI does support many of the AAS feature and unless there is any feature which is avaible in AAS only and you are planning to use that , I will suggest you stick to PowerBI .

    Please do let me know how it goes .
    Thanks
    Himanshu

    -------------------------------------------------------------------------------------------------------------------------

    • Please don't forget to click on 130616-image.png or upvote 130671-image.png button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how
    • Want a reminder to come back and check responses? Here is how to subscribe to a notification
    • If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators
    0 comments No comments

  2. System2000 236 Reputation points
    2021-10-20T09:01:01.673+00:00

    Hi @HimanshuSinha-msft ,

    Thanks for your answer, but I think you've misinterpreted my question. I am mainly interested in the part in the middle, between ingestion of data and Power BI reporting.

    I can handle the ingestion through Data Factory pipelines (or Azure Synapse Analytics pipelines), but I have multiple options on whether to use Data Lake, Azure SQL databases, blob storage, etc - and secondly I have several ways I can get the data to Power BI e.g. through Azure Synapse Analytics, AAS, etc. I don't know which options to use, or why.

    This is my current preferred approach:
    Several External MS SQL DBs --> ADF --> CSV/blob --> Data Lake --> Azure Synapse Analytics --> Power BI

    Would you say I should be doing it differently? e.g. you might say "you don't need blob or Data Lake, just use an Azure SQL DB, and then you could use AAS instead of Synapse." Or would you say my approach is fine? Cost of Azure services is not an issue, unless significantly different.

    Thanks again.

    0 comments No comments

  3. System2000 236 Reputation points
    2021-10-20T13:15:08.05+00:00

    Hi @HimanshuSinha-msft ,

    As a follow up to my previous post, here are some examples of data warehouse architecture that I found on microsoft.com:

    https://video2.skills-academy.com/en-us/azure/architecture/reference-architectures/data/enterprise-bi-synapse

    https://video2.skills-academy.com/en-us/azure/architecture/example-scenario/data/data-warehouse

    https://video2.skills-academy.com/en-us/azure/architecture/solution-ideas/articles/enterprise-data-warehouse

    There are probably more examples, but in each of the above the architecture is different. One is using blob storage, another is using data lake, another is using Azure Synapse Analytics as storage. I'd like to commit to an architecture, but it's difficult to know which.

    P.S. I appreciate your help.

    0 comments No comments

  4. System2000 236 Reputation points
    2021-10-26T15:53:53.403+00:00

    Ok. I decided how I'm going to do it. I don't know if this would count as an answer, so I'm leaving it open until someone either agrees or disagrees, or enough time passes without reply.

    • I'm going to use Azure Synapse pipelines to ingest data from SQL Server databases and other data sources (utilising integration runtimes for on-prem DBs).
    • I'm going to store this data in an Azure Data Lake Gen 2. I believe I will need to convert SQL Server data tables to CSV in the ingestion process in order to save the data in the data lake. This will likely be the staging area for the data.
    • I'm going to employ an Azure Synapse SQL pool, although I'm not sure whether serverless or dedicated yet
      --- if dedicated pool, then I will manipulate data from the data lake through a Synapse pipeline data flow and save it in the data warehouse for reporting.
      ---- if serverless, I will create a logical data warehouse and use views to serve the report data
    • I will link a Power BI workspace(s) to the Azure Synapse workspace, create Power BI reports through Power BI desktop and publish them to Power BI online. From there, the reports can be shared to the organisation.

    Does this sound like a good plan? Anyone?

    0 comments No comments